Histone deacetylase-related gene and protein
Field of the Invention 

This invention relates to a histone deacetylase gene and gene product. In particular, the 
invention relates to a protein that is highly homologous to known yeast histone deacetylase 1
(hda1) class II histone deacetylases (HDACs), nucleic acid molecules that encode such a protein,
antibodies that recognize the protein, and methods for diagnosing conditions related to abnormal 
HDAC activity, including, for example, abnormal cell proliferation, cancer, atherosclerosis, 
inflammatory bowel disease, host inflammatory or immune response or psoriasis. 


Background of the Invention 

Histone acetylation is a major regulatory mechanism that modulates gene expression by
altering the accessibility of transcription factors to DNA. Acetylation of histones is a reversible
modification of the free ε-amino group of lysine that occurs during the assembly of nucleosomes
and during DNA synthesis. Changes in histone acetylation levels also occur during
transcriptional activation and silencing. Acetylation of histones is generally associated with
transcriptional activity, whereas deacetylation is associated with transcriptional repression.
Histone acetylation levels result from an equilibrium between competing histone acetylases and
deacetylases (Emiliani, S., Fischle, W., Van Lint, C., Al-Abed, Y., and Verdin, E., Proc. Natl.
Acad. Sci. U.S.A. 95, 2795-2800 (1998)).

HDACs have been shown to play an important role in the regulation of transcription.
HDACs function as components of complexes that are involved in transcriptional repression.
This is mediated through interactions of HDACs with multi-protein complexes and requires
deacetylase activity. HDAC complexes may contain the co-repressor mSin3A (Kasten, M.M.,
Dorland, S., Stillman, D.J. Mol. Cell. Biol. 17, 4852-4858 (1997)) and mSin3A-associated
proteins (Zhang, Y., Iratni, R., Erdjument-Bromage, H., Tempst, P., Reinberg, D. Cell 89,
357-364 (1997); Zhang, Y., Sun, Z.W., Iratni, R., Erdjument-Bromage, H., Tempst, P., Hampsey, M.,
Reinberg, D. Mol. Cell 1, 1021-1031 (1998)), silencing mediators NCoR (Nagy, L., H.-Y. Kao,
D. Chakravarti, R. J. Lin, C. A. Hassig, D. E. Ayer, S. L. Schreiber, and R. M. Evans (1997) Cell
89, 373-380) and SMRT (Alland, L. et al., Nature 387:49-55 (1997); Heinzel, T. et al., Nature
387:43-8 (1997)), transcriptional repressors Rb (Brownell, J. E., Zhou, J., Ranalli, T., Kobayashi,
R., Edmondson, D. G., Roth, S.Y., and Allis, C. D. (1996) Cell 84, 843-851), Rb-like proteins
p107 (Ferreira, R., Magnaghi-Jaulin, L., Robin, P., Harel-Bellan, A., Trouche, D. (1998) Proc.
Natl. Acad. Sci. U.S.A. 95, 10493-10498) and p130 (Stiegler, P., De Luca, A., Bagella, L.,
Giordano, A. (1998) Cancer Res. 389, 187-190), Rb-associated proteins (Nicolas, E., Morales,
V., Magnaghi-Jaulin, L., Harel-Bellan, A., Richard-Foy, H., Trouche, D. (2000) J. Biol. Chem.
275, 9797-9804; Lai, A., Lee, J.M., Yang, W.M., DeCaprio, J.A., Kaelin, W.G. Jr., Seto, E.,
Branton, P.E. (1999) Mol. Cell. Biol. 19, 6632-6641), Mad/Max (Laherty, C., W.-M. Yang,
J.-M. Sun, J. R. Davie, E. Seto, and R. N. Eisenman (1997) Cell 89, 349-456), nuclear hormone
receptors (Nagy, L., H.-Y. Kao, D. Chakravarti, R. J. Lin, C. A. Hassig, D. E. Ayer, S. L.
Schreiber, and R. M. Evans (1997) Cell 89, 373-380), nucleosome remodeling factors (Xue, Y.,
Wong, J., Moreno, G.T., Young, M.K., Cote, J., Wang, W. (1998) Mol. Cell 2, 851-861),
methyl-binding proteins (Fuks, F., Burgers, W.A., Brehm, A., Hughes-Davies, L., Kouzarides, T.
(2000) Nat. Genet. 24, 88-91; Nan, X., Ng, H.H., Johnson, C.A., Laherty, C.D., Turner, B.M.,
Eisenman, R.N., Bird, A. (1998) Nature 393, 386-389; Ghosh, A.K., Steele, R., Ray, R.B. (1999)
Biochem. Biophys. Res. Commun. 260, 405-409; Ng, H.H., Zhang, Y., Hendrich, B., Johnson,
C.A., Turner, B.M., Erdjument-Bromage, H., Tempst, P., Reinberg, D., Bird, A. (1999) Nat.
Genet. 23, 58-61), and DNA repair machinery proteins (Yarden, R.I., Brody, L.C. (1999) Proc.
Natl. Acad. Sci. U.S.A. 96, 4983-4988; Cai, R.L., Yan-Neale, Y., Cueto, M.A., Xu, H., Cohen,
D. (2000) J. Biol. Chem. 275, 27909-27916). Furthermore, HDAC1 has been found to bind
directly to YY1 (Yang, W.-M., Inouye, C., Zeng, Y., Bearss, D., and Seto, E. (1996) Proc. Natl.
Acad. Sci. 93, 12845-12850) and Sp1 (Doetzlhofer, A., Rotheneder, H., Lagger, G., Koranda,
M., Kurtev, V., Brosch, G., Wintersberger, E., Seiser, C. (1999) Mol. Cell. Biol. 19, 5504-5511),
and HDACs 4 and 5 bind to MEF2 (Grozinger, C. M., and Schreiber, S. L. (2000) Proc. Natl.
Acad. Sci. 97, 7835-7840). In addition, HDACs have been found together in complexes (Eilers,
A.L., Billin, A.N., Liu, J., Ayer, D.E. (1999) J. Biol. Chem. 274, 32750-32756; Grozinger, C. M.,
and Schreiber, S. L. (2000) Proc. Natl. Acad. Sci. 97, 7835-7840).

Two distinct classes of yeast histone deacetylases have been identified based upon size
and sequence. Yeast class I HDACs include Rpd3, Hos1p, and Hos2p. Class II contains yeast
Hda1p. Furthermore, members of these two classes were found to form different complexes.
Human HDACs have been classified based upon their similarity to yeast sequences. Class I
human HDACs include HDACs 1-3 and 8. Class II HDACs include HDACs 4-7. The
deacetylase core of class I HDACs resides in the first ~390 amino acids. Class II HDAC catalytic
domains are located in the C-terminal region of these peptides, with the exception of HDAC4, which
contains a second catalytic domain in the N-terminus (Grozinger, C. M., Hassig, C. A., and
Schreiber, S. L. (1999) Proc. Natl. Acad. Sci. U.S.A. 96, 4868-4873).


An important approach that has been used to study the function of chromatin acetylation
is the use of specific inhibitors of histone deacetylase. Several classes of compounds have been
identified that inhibit HDAC. Histone deacetylase inhibitors have been found to have
anti-proliferative effects, including induction of G1/S and G2/M cell cycle arrest, differentiation
(Itazaki, H., K. Nagashima, K. Sugita, H. Yoshida, Y. Kawamura, Y. Yasuda, K. Matsumoto, K.
Ishii, N. Uotani, H. Nakai, A. Terui, S. Yoshimatsu, Y. Ikenishi and Y. Nakagawa (1990) J.
Antibiot. 12, 1524-1532; Hoshikawa, Y., Kijima, M., Yoshida, M., and Beppu, T. (1991) Agric.
Biol. Chem. 55, 1491-1497; Hoshikawa, Y., Kwon, H.-J., Yoshida, M., Horinouchi, S., and
Beppu, T. (1994) Exp. Cell Res. 214, 189-197; Sugita, K., Koizumi, K., and Yoshida, H. (1992)
Cancer Res. 52, 168-172; Yoshida, M., Y. Hoshikawa, K. Koseki, K. Mori and T. Beppu (1990)
J. Antibiot. 43, 1101-1106; Yoshida, M., Nomura, S., and Beppu, T. (1987) Cancer Res. 47,
3688-3691), and apoptosis (Medina, V., Edmonds, B., Young, G. P., James, R., Appleton, S.,
Zalewski, P. D. (1997) Cancer Res. 57, 3697-3707) of transformed and normal cells, and reversal
of transformation (Kwon, H. J., Owa, T., Hassig, C. A., Shimada, J., and Schreiber, S. (1998)
Proc. Natl. Acad. Sci. U.S.A. 95, 3356-3361; Kim, M.-S., Son, M.-W., Park, Y. I., and Moon,
A. (2000) Cancer Lett. 157, 23-30). These effects, along with the presence of HDAC in
complexes with fusions of unliganded retinoic acid receptors PML-RARα and PLZF-RARα,
indicate a role for HDACs in tumorigenicity (Grignani, F., De Matteis, S., Nervi, C., Tomassoni,
L., Gelmetti, V., Cioce, M., Fanelli, M., Ruthardt, M., Ferrara, F. F., Zamir, I., Seiser, C.,
Grignani, F., Lazar, M. A., Minucci, S., Pelicci, P. G. (1998) Nature 391, 815-818; He, L. Z.,
Guidez, F., Tribioli, C., Perazzi, D., Ruthardt, M., Zelent, A., Pandolfi, P. P. (1998) Nat. Genet.
18, 126-35; Lin, R. J., Nagy, L., Inoue, S., Shao, W., Miller, W. H. Jr., and Evans, R. M. (1998)
Nature 391, 811-814). Furthermore, the histone deacetylase inhibitors phenylbutyrate and
trichostatin A have shown promise in the treatment of promyelocytic leukemia, and several other
HDAC inhibitors are being studied and are nearing the clinic (Byrd, J.C., Shinn, C., Ravi, R.,
Willis, C.R., Waselenko, J.K., Flinn, I.W., Dawson, N.A., Grever, M.R. (1999) Blood 94,
1401-1408; Kim, Y.B., Lee, K.H., Sugita, K., Yoshida, M., Horinouchi, S. (1999) Oncogene 18,
2461-2470; Cohen, L.A., Amin, S., Marks, P.A., Rifkind, R.A., Desai, D., Richon, V.M. (1999)
Anticancer Res. 19, 4999-5005). In addition, the HDAC inhibitor butyrate was found to decrease
expression of the pro-inflammatory cytokines TNF-α, TNF-β, IL-6, and IL-1β. These effects are
thought to result from inhibition of NF-κB activation (Segain, J.P., Raingeard de la Bletiere, D.,
Bourreille, A., Leray, V., Gervois, N., Rosales, C., Ferrier, L., Bonnet, C., Blottiere, H.M.,
Galmiche, J.P. (2000) Butyrate inhibits inflammatory responses through NFkappaB inhibition:
implications for Crohn's disease. Gut 47, 397-403) and its ability to inhibit histone deacetylases
(Inan, M.S., Rasoulpour, R.J., Yin, L., Hubbard, A.K., Rosenberg, D.W., Giardina, C. (2000)
The luminal short-chain fatty acid butyrate modulates NF-kappaB activity in a human colonic
epithelial cell line. Gastroenterology 118, 724-34).

The discovery of the HDAC inhibitor trapoxin made it possible to isolate the first human
histone deacetylase, HDAC1, using an affinity matrix column to which a trapoxin-like molecule
was bound (Taunton, J., Collins, J. L., and Schreiber, S. (1996) J. Am. Chem. Soc. 118,
10412-10422). Subsequently, seven other human HDAC enzyme isoforms were reported (Taunton, J.,
Hassig, C. A., and Schreiber, S. L. (1996) Science 272, 408-411; Yang, W. M., Inouye, C., Zeng,
Y., Bearss, D., and Seto, E. (1996) Proc. Natl. Acad. Sci. U.S.A. 93, 12845-12850; Yang, W. M.,
Yao, Y. L., Sun, J. M., Davie, J. R., and Seto, E. (1997) J. Biol. Chem. 272, 28001-28007;
Emiliani, S., Fischle, W., Van Lint, C., Al-Abed, Y., and Verdin, E. (1998) Proc. Natl. Acad.
Sci. U.S.A. 95, 2795-2800). These 8 HDACs have been divided into class I (HDACs 1-3 and 8,
similar to the yeast gene Rpd3) and class II (HDACs 4-7, similar to the yeast gene hda1)
based on sequence homology (Grozinger, C. M., Hassig, C. A., and Schreiber, S. L. (1999) Proc.
Natl. Acad. Sci. U.S.A. 96, 4868-4873). Here we report the isolation and characterization of a
potential new HDAC, referred to herein as HDAC9, which displays sequence similarity to the hda1
class II HDACs. HDAC9 has characteristics that bridge HDAC class I and class II.


Summary of the Invention 

The present invention relates to histone deacetylases, in particular to a novel histone
deacetylase HDAC9. 

In a first aspect, the invention provides an isolated polypeptide comprising an amino acid
sequence as set forth in SEQ ID NO:1, SEQ ID NO:5 or SEQ ID NO:6. Furthermore, the
invention provides an isolated polypeptide consisting of an amino acid sequence as set forth in
SEQ ID NO:1, SEQ ID NO:5 or SEQ ID NO:6. The amino acid sequence as set forth in SEQ ID
NO:1, SEQ ID NO:5 or SEQ ID NO:6 shows a considerable degree of homology to that of
known members of the family of HDACs. For convenience, the polypeptide consisting of the
amino acid sequence as set forth in SEQ ID NO:1, SEQ ID NO:5 or SEQ ID NO:6 will be
designated as histone deacetylase 9 or HDAC9. Such a polypeptide, or a fragment thereof, is
expressed in various normal tissues; for example, HDAC9 was present in normal testes, stomach,
spleen, small intestine, placenta, liver, kidney, colon, lung, heart, and brain as an approximately
3 kb transcript. HDAC9 was not detected in muscle, but this lane also did not hybridize to GAPDH
(Figure 7). Fragments of the isolated polypeptide having an amino acid sequence as set forth in
SEQ ID NO:1, SEQ ID NO:5 or SEQ ID NO:6 will comprise polypeptides comprising from
about 5 to 148 amino acids, preferably from about 10 to about 143 amino acids, more preferably
from about 20 to about 100 amino acids, and most preferably from about 20 to about 50 amino
acids. Such fragments also form a part of the present invention. Preferably, fragments will
encompass the catalytic domain, which is predicted to exist between amino acid number 1 to
390. In accordance with this aspect of the invention there are provided novel polypeptides of
human origin as well as biologically, diagnostically or therapeutically useful fragments, variants
and derivatives thereof, variants and derivatives of the fragments, and analogs of the foregoing.
In a second aspect, the invention provides an isolated DNA comprising a nucleotide
sequence that encodes a polypeptide as mentioned above. In particular, the invention provides
(1) an isolated DNA comprising the nucleotide sequence as set forth in SEQ ID NO:2, SEQ ID
NO:7 or SEQ ID NO:8; (2) an isolated DNA comprising the nucleotide sequence set forth in SEQ
ID NO:3; (3) an isolated DNA capable of hybridizing under high stringency conditions to the
nucleotide sequence set forth in SEQ ID NO:3; and (4) an isolated DNA comprising the
nucleotide sequence set forth in SEQ ID NO:4. Also provided are nucleic acid sequences
comprising at least about 15 bases, preferably at least about 20 bases, more preferably a nucleic
acid sequence comprising about 30 contiguous bases of SEQ ID NO:2, SEQ ID NO:7, SEQ
ID NO:8 or SEQ ID NO:3. Also within the scope of the present invention are nucleic acids that
are substantially similar to the nucleic acid with the nucleotide sequence as set forth in SEQ ID
NO:2, SEQ ID NO:7, SEQ ID NO:8 or SEQ ID NO:3. In a preferred embodiment, the
isolated DNA takes the form of a vector molecule comprising at least a fragment of a DNA of
the present invention, in particular comprising the DNA consisting of a nucleotide sequence as
set forth in SEQ ID NO:2, SEQ ID NO:7, SEQ ID NO:8 or SEQ ID NO:3.

A third aspect of the present invention encompasses a method for the diagnosis of
conditions associated with abnormal regulation of gene expression, which include, but are not
limited to, conditions associated with abnormal cell proliferation, cancer, atherosclerosis,
inflammatory bowel disease, or psoriasis in a human, which comprises detecting abnormal
transcription of messenger RNA transcribed from the natural endogenous human gene encoding
the novel polypeptide consisting of the amino acid sequence set forth in SEQ ID NO:1, SEQ ID
NO:5 or SEQ ID NO:6 in an appropriate tissue or cell from a human, wherein such abnormal
transcription is diagnostic of the human's affliction with such a condition. In particular, the said
natural endogenous human gene encoding the novel polypeptide consisting of the amino acid
sequence set forth in SEQ ID NO:1, SEQ ID NO:5 or SEQ ID NO:6 comprises the genomic
nucleotide sequence set forth in SEQ ID NO:4. In one embodiment of the present invention, the
diagnostic method comprises contacting a sample of said appropriate tissue or cell or contacting
an isolated RNA or DNA molecule derived from that tissue or cell with an isolated nucleotide
sequence of at least about 15-20 nucleotides in length that hybridizes under high stringency
conditions with the isolated nucleotide sequence encoding the novel polypeptide having an
amino acid sequence set forth in SEQ ID NOs:1, 5 or 6.

Another embodiment of the assay aspect of the invention provides a method for the
diagnosis of a condition associated with abnormal HDAC9 activity in a human, which comprises
measuring the level of deacetylase activity in a certain tissue or cell from a human suffering from
such a condition, wherein the presence of an abnormal level of deacetylase activity, relative to
the level thereof in the respective tissue or cell of a human not suffering from a condition
associated with abnormal HDAC activity, is diagnostic of the human's suffering from said
condition.



In accordance with one embodiment of this aspect of the invention there are provided
anti-sense polynucleotides that can regulate transcription of the gene encoding the novel
HDAC9; in another embodiment, double stranded RNA is provided that can regulate the
transcription of the gene encoding the novel HDAC9.



Another aspect of the invention provides a process for producing the aforementioned
polypeptides, polypeptide fragments, variants and derivatives, fragments of the variants and
derivatives, and analogs of the foregoing. In a preferred embodiment of this aspect of the
invention there are provided methods for producing the aforementioned HDAC9 comprising
culturing host cells having incorporated therein an expression vector containing an
exogenously-derived nucleotide sequence encoding such a polypeptide under conditions sufficient
for expression of the polypeptide in the host cell, thereby causing expression of the polypeptide,
and optionally recovering the expressed polypeptide. In a preferred embodiment of this aspect of
the present invention, there is provided a method for producing polypeptides comprising or
consisting of an amino acid sequence as set forth in SEQ ID NOs:1, 5 or 6 which comprises
culturing a host cell having incorporated therein an expression vector containing an
exogenously-derived polynucleotide encoding a polypeptide comprising or consisting of an amino
acid sequence as set forth in SEQ ID NOs:1, 5 or 6 under conditions sufficient for expression of
such a polypeptide in the host cell, thereby causing the production of an expressed polypeptide,
and optionally recovering the expressed polypeptide. Preferably, in any of such methods the
exogenously derived polynucleotide comprises or consists of the nucleotide sequence set forth in
SEQ ID NOs:2, 7 or 8, the nucleotide sequence set forth in SEQ ID NO:3, or the nucleotide
sequence set forth in SEQ ID NO:4. In accordance with another aspect of the invention there are
provided products, compositions, processes and methods that utilize the aforementioned
polypeptides and polynucleotides for, inter alia, research, biological, clinical and therapeutic
purposes.


In certain additional preferred embodiments of this aspect of the invention there is
provided an antibody or a fragment thereof which specifically binds to a polypeptide that
comprises the amino acid sequence set forth in SEQ ID NOs:1, 5 or 6, i.e., all HDAC9 variants.
In certain particularly preferred embodiments in this regard, the antibodies are highly selective
for human HDAC9 polypeptides or portions of human HDAC9 polypeptides.

In a further aspect, an antibody or fragment thereof is provided that binds to a fragment 
or portion of the amino acid sequence set forth in SEQ ID NOs:1, 5 or 6.

In another aspect, methods of treating a condition in a subject, wherein the condition is
associated with abnormal HDAC9 gene expression, an increase or decrease in the presence of
HDAC9 polypeptide in a subject, or an increase or decrease in the activity of HDAC9
polypeptide, by the administration of an effective amount of an antibody that binds to a
polypeptide with the amino acid sequence set out in SEQ ID NOs:1, 5 or 6, or a fragment or
portion thereof, to the subject are provided. Also provided are methods for the diagnosis of a
disease or condition associated with abnormal HDAC9 gene expression or an increase or
decrease in the presence of the HDAC9 in a subject, or an increase or decrease in the activity of
HDAC9 polypeptide, which comprise utilizing conventional methodologies, including, for
example, the H4 histone assay that was previously described (Inokoshi, J., Katagiri, M., Arima,
S., Tanaka, H., Hayashi, M., Kim, Y.-B., Furumai, R., Yoshida, M., Horinouchi, S., Omura, S.
(1999) Biochem. Biophys. Res. Commun. 256, 372-376).

In yet another aspect, the invention provides host cells which can be propagated in vitro,
preferably vertebrate cells, in particular mammalian cells, or bacterial cells, which are capable
upon growth in culture of producing a polypeptide that comprises the amino acid sequence set
forth in SEQ ID NOs:1, 5 or 6 or fragments thereof, where the cells contain transcriptional
control DNA sequences, where the transcriptional control sequences control transcription of
RNA encoding a polypeptide with the amino acid sequence according to SEQ ID NOs:1, 5 or 6,
or fragments thereof. This includes, but is not limited to, the propagation of HDAC9 in a
plasmid and the production of DNA, RNA or protein in human or insect cells or bacteria using
the endogenous HDAC9 promoter or any other transcriptional control sequence.



In yet another aspect of the present invention there are provided assay methods and kits
comprising the components necessary to detect above-normal expression of polynucleotides
encoding a polypeptide comprising an amino acid sequence as set forth in SEQ ID NOs:1, 5 or 6,
or polypeptides comprising an amino acid sequence set forth in SEQ ID NOs:1, 5 or 6, or
fragments thereof, in body tissue samples derived from a patient, such kits comprising, e.g.,
antibodies that bind to a polypeptide comprising an amino acid sequence set forth in SEQ ID
NOs:1, 5 or 6 or to fragments thereof, or oligonucleotide probes that hybridize with
polynucleotides of the invention. In a preferred embodiment, such kits also comprise
instructions detailing the procedures by which the kit components are to be used.

In another aspect, the invention is directed to use of a polypeptide comprising an amino
acid sequence set forth in SEQ ID NOs:1, 5 or 6, or fragment thereof, polynucleotide encoding
such a polypeptide or a fragment thereof, or antibody that binds to said polypeptide comprising
an amino acid sequence set forth in SEQ ID NOs:1, 5 or 6, or a fragment thereof, in the
manufacture of a medicament to treat diseases associated with abnormal HDAC activity or gene
expression.



Another aspect is directed to pharmaceutical compositions comprising a polypeptide
comprising or consisting of an amino acid sequence set forth in SEQ ID NOs:1, 5 or 6, or
fragment thereof, a polynucleotide encoding such a polypeptide or a fragment thereof, or
antibody that binds to such a polypeptide or a fragment thereof, in conjunction with a suitable
pharmaceutical carrier, excipient or diluent, for the treatment of diseases associated with
abnormal HDAC activity or gene expression.






WO 02/50285 



PCT/EP01/14928 



In another aspect, the invention is directed to methods for the identification of molecules
that can bind to a polypeptide comprising an amino acid sequence set forth in SEQ ID NOs:1, 5
or 6 and/or modulate the activity of a polypeptide comprising an amino acid sequence set forth
in SEQ ID NOs:1, 5 or 6, or molecules that can bind to nucleic acid sequences that modulate the
transcription or translation of a polynucleotide encoding a polypeptide comprising an amino acid
sequence set forth in SEQ ID NOs:1, 5 or 6. Such methods are disclosed in, e.g., U.S. Patent
Nos. 5,541,070; 5,567,317; 5,593,853; 5,670,326; 5,679,582; 5,856,083; 5,858,657; 5,866,341;
5,876,946; 5,989,814; 6,010,861; 6,020,141; 6,030,779; and 6,043,024, all of which are
incorporated by reference herein in their entirety. Molecules identified by such methods also fall
within the scope of the present invention.

In a related aspect, the invention is directed to use of the novel HDAC9 to identify
associated proteins in HDAC biologically relevant complexes. At present, the proteins that 
associate with HDAC9 are not known. However, these may be characterized by determining 
whether HDAC9 associates with proteins that have been previously shown to interact with other 
HDACs (see Introduction). For example, components of HDAC9 complexes may be determined 
using conventional methods, including co-immunoprecipitation (see Example 9). 

In yet another aspect, the invention is directed to methods for the introduction of nucleic 
acids of the invention into one or more tissues of a subject in need of treatment with the result 
that one or more proteins encoded by the nucleic acids are expressed and/or secreted by cells
within the tissue. 

Other objects, features, advantages and aspects of the present invention will become 
apparent to those of skill from the following description. It should be understood, however, that 
the following description and the specific examples, while indicating preferred embodiments of 
the invention, are given by way of illustration only. Various changes and modifications within 
the spirit and scope of the disclosed invention will become readily apparent to those skilled in the
art from reading the following description and from reading the other parts of the present
disclosure.

Brief Description of the Drawings 
Figure 1 shows the 1156 bp open reading frame that was identified using GENFAM
(proprietary software) and used to search databases for the complete HDAC9 cDNA sequence.
The respective ORF (SEQ ID NO:3) starts at nucleotide position no. 1 and ends at nucleotide
position no. 1156.

Figures 2A and 2B show the full length cDNA sequence (SEQ ID NO:2) of HDAC9 and
the amino acid sequence (SEQ ID NO:1), respectively. The full length cDNA sequence starts at
nucleotide position no. 1 and ends at nucleotide position 2022.

Figure 3 shows the genomic DNA sequence in silico (AL022328) (SEQ ID NO:4), 
aligned with the sequence of clone 198929/HDAC9. The alignment was produced using
proprietary software (Novartis Pharmaceuticals, Summit, NJ). 

Figure 4 is a depiction of the alignment of HDAC9 predicted peptide and S. pombe Hda1
peptide. The query is HDAC9 peptide and the subject is S. pombe Hda1 peptide. The alignment
was produced using the ClustalW algorithm (Higgins, D.G., Thompson, J.D., Gibson, T.J. (1996)
Using CLUSTAL for multiple sequence alignments. Methods Enzymol. 266, 383-402).

Figure 5 shows the alignment of HDAC1 and HDAC9v1 and locations of the putative
catalytic domain amino acids and Rb-binding domain. Catalytic domain amino acids are boxed
and putative Rb domain amino acids are contained within crosshatched boxes. The alignment
was produced using the ClustalW algorithm (Higgins, D.G., Thompson, J.D., Gibson, T.J. (1996)
Using CLUSTAL for multiple sequence alignments. Methods Enzymol. 266, 383-402).

Figure 6 shows the alignment of HDACs 1-9v1. The alignment was produced using the
ClustalW algorithm (Higgins, D.G., Thompson, J.D., Gibson, T.J. (1996) Using CLUSTAL for
multiple sequence alignments. Methods Enzymol. 266, 383-402).

Figure 7 shows the Northern analysis of HDAC9. (A) Northern blot analysis of the 
distribution of HDAC9 in normal human tissues. GAPDH was hybridized to the same blot as a 
control for RNA loading. (B) Northern blot analysis of HDAC9 in matched tumor and normal 
tissues. GAPDH was hybridized to the same blot as a control for RNA loading. 

Figure 8 shows Real Time PCR analysis of the distribution of HDAC9 in normal human 
tissues and cell lines relative to 18S ribosomal RNA. RNA from the human lung carcinoma cell
line A549 was used as an internal control.

Figure 9 shows the alignment of HDAC9v1 with class II HDACs (HDACs 4, 5, 6, 7). The
alignment was produced using the ClustalW algorithm (Higgins, D.G., Thompson, J.D., Gibson,
T.J. (1996) Using CLUSTAL for multiple sequence alignments. Methods Enzymol. 266,
383-402). Catalytic domain amino acids are boxed.

Figure 10 shows the alignment of HDAC9v1 with class I HDACs (HDACs 1, 2, 3, 8). The
alignment was produced using the ClustalW algorithm (Higgins, D.G., Thompson, J.D., Gibson,
T.J. (1996) Using CLUSTAL for multiple sequence alignments. Methods Enzymol. 266,
383-402). Catalytic domain amino acids are boxed.

Figure 11 shows the three HDAC9 sequence variants (HDAC9v1, HDAC9v2, and
HDAC9v3). HDAC9v1 and HDAC9v2 were found by searching the human EST database and
HDAC9v3 was found as a predicted transcript in the Celera Sequence database. (A) shows an
alignment of the 3 HDAC9 variant peptide sequences. (B) shows a schematic of class I and class
II HDAC peptide sequences. Catalytic domains are in filled boxes and putative LXCXE motifs
are in open boxes. (C) is a schematic of the genomic structures of HDAC9v1 and HDAC9v2.
Exons are shown as filled boxes and introns are shown as lines between the filled boxes. Lengths
of boxes and lines represent the lengths of exons and introns.



Figure 12 shows that HDAC9 is an enzymatically active histone deacetylase. (A) shows that
HDAC9 catalytic activity is comparable to the activity of HDAC3 and HDAC4. (B) shows that
HDAC1 was more efficient than HDAC3, HDAC4, and HDAC9 at deacetylating the histone
substrate in this assay.

Figure 13 shows that HDAC9 is a nuclear protein and that HDAC9-flag is translated
in vitro.



Figure 14 shows DNA and peptide sequences for HDAC9v3 and HDAC9v2.

Detailed Description of the Invention

All patent applications, patents and literature references cited herein are hereby 
incorporated by reference in their entirety. 

In practicing the present invention, many conventional techniques in molecular biology,
microbiology, and recombinant DNA are used. These techniques are well known and are
explained in, for example, Current Protocols in Molecular Biology, Volumes I, II, and III, 1997
(F. M. Ausubel ed.); Sambrook et al., 1989, Molecular Cloning: A Laboratory Manual, Second
Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.; DNA Cloning: A
Practical Approach, Volumes I and II, 1985 (D. N. Glover ed.); Oligonucleotide Synthesis, 1984
(M. J. Gait ed.); Nucleic Acid Hybridization, 1985 (Hames and Higgins eds.); Transcription and
Translation, 1984 (Hames and Higgins eds.); Animal Cell Culture, 1986 (R. I. Freshney ed.);
Immobilized Cells and Enzymes, 1986 (IRL Press); Perbal, 1984, A Practical Guide to
Molecular Cloning; the series, Methods in Enzymology (Academic Press, Inc.); Gene Transfer
Vectors for Mammalian Cells, 1987 (J. H. Miller and M. P. Calos eds., Cold Spring Harbor
Laboratory); and Methods in Enzymology Vol. 154 and Vol. 155 (Wu and Grossman, and Wu,
eds., respectively).

The following abbreviations used throughout the disclosure are listed herein below:
histone deacetylase (HDAC), histone deacetylase-like protein (HDLP).

In its broadest sense, the term "substantially similar", when used herein with respect to a
nucleotide sequence, means a nucleotide sequence corresponding to a reference nucleotide
sequence, wherein the corresponding sequence encodes a polypeptide having substantially the
same structure and function as the polypeptide encoded by the reference nucleotide sequence,
e.g. where only changes in amino acids not affecting the polypeptide function occur. Desirably
the substantially similar nucleotide sequence encodes the polypeptide encoded by the reference
nucleotide sequence. The percentage of identity between the substantially similar nucleotide
sequence and the reference nucleotide sequence desirably is at least 80%, more desirably at least
85%, preferably at least 90%, more preferably at least 95%, still more preferably at least 99%.
Sequence comparisons are carried out using Clustalw (see, for example, Higgins, D.G. et al.
Methods Enzymol. 266:383-402 (1996)). Clustalw alignments were performed using default
parameters.
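
The identity percentages above can be illustrated with a minimal sketch. It assumes two sequences that have already been aligned (for example by Clustalw) and uses "-" for gaps; excluding gap columns from the count is an assumption of this sketch, and the alignment program's own identity score may be defined differently. The function names and the toy sequences are hypothetical and are not taken from any SEQ ID NO.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # assumption: gap columns are excluded from the count
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def meets_identity(pct: float, minimum: float = 80.0) -> bool:
    """True if pct reaches the chosen minimum (80% is the lowest tier in the text)."""
    return pct >= minimum


# Toy aligned sequences (hypothetical, for illustration only).
pct = percent_identity("ACGTACGTAC", "ACGTACGTAA")
print(pct, meets_identity(pct), meets_identity(pct, minimum=95.0))
```

The `minimum` argument can be set to 85, 90, 95 or 99 to test against the stricter tiers named in the paragraph.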

A nucleotide sequence "substantially similar" to a reference nucleotide sequence
hybridizes to the reference nucleotide sequence in 7% sodium dodecyl sulfate (SDS), 0.5 M
NaPO4, 1 mM EDTA at 50°C with washing in 2X SSC, 0.1% SDS at 50°C, more desirably in
7% sodium dodecyl sulfate (SDS), 0.5 M NaPO4, 1 mM EDTA at 50°C with washing in 1X
SSC, 0.1% SDS at 50°C, more desirably still in 7% sodium dodecyl sulfate (SDS), 0.5 M
NaPO4, 1 mM EDTA at 50°C with washing in 0.5X SSC, 0.1% SDS at 50°C, preferably in 7%
sodium dodecyl sulfate (SDS), 0.5 M NaPO4, 1 mM EDTA at 50°C with washing in 0.1X SSC,
0.1% SDS at 50°C, more preferably in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO4, 1 mM
EDTA at 50°C with washing in 0.1X SSC, 0.1% SDS at 65°C, yet still encodes a functionally
equivalent gene product.

"Elevated transcription of mRNA" refers to a greater amount of messenger RNA
transcribed from the natural endogenous human gene encoding the novel polypeptide of the
present invention present in an appropriate tissue or cell of an individual suffering from a
condition associated with abnormal HDAC9 activity than in a subject not suffering from such a
disease or condition; in particular at least about twice, preferably at least about five times, more
preferably at least about ten times, most preferably at least about 100 times the amount of mRNA
found in corresponding tissues in humans who do not suffer from such a condition. Such
elevated level of mRNA may eventually lead to increased levels of protein translated from such
mRNA in an individual suffering from a condition associated with abnormal cellular
proliferation as compared with a healthy individual. It is also understood that "elevated
transcription of mRNA" may refer to a greater amount of messenger RNA transcribed from
genes the expression of which is modulated by HDAC9 either alone or in combination with other
molecules.
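
The fold thresholds in the definition above can be sketched as a simple lookup. This is an illustrative sketch only: the definition says "at least about", so the hard cut-offs used here are an assumption, and the function name, the units of the mRNA levels and the example values are hypothetical.

```python
from typing import Optional

# Fold thresholds and wording taken from the definition above, highest first.
FOLD_TIERS = [
    (100.0, "most preferably"),
    (10.0, "more preferably"),
    (5.0, "preferably"),
    (2.0, "in particular"),
]


def elevation_tier(patient_level: float, control_level: float) -> Optional[str]:
    """Return the highest tier reached by patient/control, or None if below twofold."""
    if control_level <= 0:
        raise ValueError("control level must be positive")
    fold = patient_level / control_level
    for minimum, label in FOLD_TIERS:
        if fold >= minimum:
            return label
    return None


# Hypothetical mRNA measurements in arbitrary units.
print(elevation_tier(50.0, 10.0))
```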

A "host cell," as used herein, refers to a prokaryotic or eukaryotic cell that contains
heterologous DNA that has been introduced into the cell by any means, e.g., electroporation,
calcium phosphate precipitation, microinjection, transformation, viral infection, and the like.
"Heterologous" as used herein means "of different natural origin" or representing a
non-natural state. For example, if a host cell is transformed with a DNA or gene derived from
another organism, particularly from another species, that gene is heterologous with respect to
that host cell and also with respect to descendants of the host cell which carry that gene.
Similarly, heterologous refers to a nucleotide sequence derived from and inserted into the same
natural, original cell type, but which is present in a non-natural state, e.g. a different copy
number, or under the control of different regulatory elements.

A "vector" molecule is a nucleic acid molecule into which heterologous nucleic acid may
be inserted which can then be introduced into an appropriate host cell. Vectors preferably have
one or more origins of replication, and one or more sites into which the recombinant DNA can be
inserted. Vectors often have convenient means by which cells with vectors can be selected from
those without, e.g., they encode drug resistance genes. Common vectors include plasmids, viral
genomes, and (primarily in yeast and bacteria) "artificial chromosomes."

"Plasmids" generally are designated herein by a lower case p preceded and/or followed
by capital letters and/or numbers, in accordance with standard naming conventions that are
familiar to those of skill in the art. Starting plasmids disclosed herein are either commercially
available, publicly available on an unrestricted basis, or can be constructed from available
plasmids by routine application of well known, published procedures. Many plasmids and other
cloning and expression vectors that can be used in accordance with the present invention are well
known and readily available to those of skill in the art. Moreover, those of skill readily may
construct any number of other plasmids suitable for use in the invention. The properties,
construction and use of such plasmids, as well as other vectors, in the present invention will be
readily apparent to those of skill from the present disclosure.

The term "isolated" means that the material is removed from its original environment
(e.g., the natural environment if it is naturally occurring). For example, a naturally-occurring
polynucleotide or polypeptide present in a living animal is not isolated, but the same
polynucleotide or polypeptide, separated from some or all of the coexisting materials in the
natural system, is isolated, even if subsequently reintroduced into the natural system. Such
polynucleotides could be part of a vector and/or such polynucleotides or polypeptides could be
part of a composition, and still be isolated in that such vector or composition is not part of its
natural environment.

As used herein, the term "transcriptional control sequence" refers to DNA sequences,
such as initiator sequences, enhancer sequences, and promoter sequences, which induce, repress,
or otherwise control the transcription of protein encoding nucleic acid sequences to which they
are operably linked.

As used herein, "human transcriptional control sequences" are any of those
transcriptional control sequences normally found associated with the human gene encoding the
novel HDAC9 polypeptide of the present invention as it is found in the respective human
chromosome. It is understood that the term may also refer to transcriptional control sequences
normally found associated with human genes the expression of which is modulated by HDAC9
either alone or in combination with other molecules.

As used herein, "non-human transcriptional control sequence" is any transcriptional
control sequence not found in the human genome.

The term "polypeptide" is used interchangeably herein with the terms "polypeptides" and
"protein(s)".

As used herein, a "chemical derivative" of a polypeptide of the invention is a polypeptide
of the invention that contains additional chemical moieties not normally a part of the molecule.
Such moieties may improve the molecule's solubility, absorption, biological half life, etc. The
moieties may alternatively decrease the toxicity of the molecule, eliminate or attenuate any
undesirable side effect of the molecule, etc. Moieties capable of mediating such effects are
disclosed, for example, in Remington's Pharmaceutical Sciences, 16th ed., Mack Publishing Co.,
Easton, Pa. (1980).

As used herein, "HDAC9" refers to the amino acid sequences of substantially purified
HDAC9 obtained from any species, particularly mammalian, including bovine, ovine, porcine,
murine, equine, and preferably human, from any source, whether natural, synthetic,
semi-synthetic, or recombinant.

As used herein, "HDAC activity", including "HDAC9 activity", refers to the ability of an
HDAC polypeptide to deacetylate histone proteins, including 3H-labeled H4 histone peptide.
Such activity may be measured according to conventional methods, for example as described in
Inokoshi, J., Katagiri, M., Arima, S., Tanaka, H., Hayashi, M., Kim, Y.-B., Furumai, R.,
Yoshida, M., Horinouchi, S., and Omura, S. (1999) Biochem. Biophys. Res. Commun. 256, 372-376.
A biologically "active" protein refers to a protein having structural, regulatory, or biochemical
functions of a naturally occurring molecule.

The term "agonist", as used herein, refers to a molecule which, when bound to HDAC9,
causes a change in HDAC9 which modulates the activity of HDAC9. Agonists may include
proteins, nucleic acids, carbohydrates, or any other molecules that bind to HDAC9.

The terms "antagonist" or "inhibitor", as used herein, refer to a molecule which, when
bound to HDAC9, blocks or modulates the biological activity of HDAC9. Antagonists and
inhibitors may include proteins, nucleic acids, carbohydrates, or any other molecules, natural or
synthetic, that bind to HDAC9.

HDAC9 was identified using proprietary computer software called GENFAM to search
for new human sequences that are related to histone deacetylases in the Celera Human Genome
Database, Incyte LIFESEQ® database and the public High Throughput Genomic database. An
1156 bp open reading frame (ORF) was identified and used to search a database of sequenced
clones from pan-tissue and dorsal root ganglion cDNA libraries. Four clones were found to
contain the ORF (M6, K10, P3, F23), two from each library. Of these clones, M6, from the
pan-tissue library, was determined to be the most complete cDNA as a result of sequence analysis
and in vitro translation. BLAST (Altschul, S.F. et al., Nucleic Acids Res. 25:3389-402 (1997)) was
used to search the Genbank database using cDNA clone M6. Genomic sequence AL022328 was found
to contain exons that were identical in sequence to the M6 cDNA. A Clustalw alignment of the
antisense sequence of HDAC9 (2022 to 8) with genomic sequence AL022328 is shown in Figure
3. The first 7 bases of the HDAC9 predicted cDNA are not aligned, presumably because they
occur following the next intron and this sequence was probably too short for the software to
determine an alignment. The sequence of cDNA clone M6 was confirmed by automated DNA
sequencing (ACGT, Inc., Northbrook, IL). Based upon the predicted cDNA sequence from
genomic sequence AL022328, 44 bases were missing from the N-terminus of M6. This sequence
was subsequently added by PCR.

The full length cDNA for HDAC9 predicts a protein of 673 amino acids. The HDAC9
cDNA sequence is 2022 base pairs in length. In order to determine the percent similarity of
HDAC9 to other known HDACs, a Clustalw multiple sequence alignment was performed using
complete peptide sequences for HDACs 1-9. HDAC9 is most similar in peptide sequence to
human HDAC6 at 37%. The Clustalw alignment of HDAC9 with class II HDACs is shown in
Figure 9. HDAC9 was also 40% similar to a yeast class II sequence hda1 from S. pombe. The
Clustalw alignment of human HDAC9 and S. pombe hda1 is shown in Figure 4. HDAC9 was less
similar to class I HDACs (<18%). The Clustalw alignment of HDAC9 to class I HDACs is
shown in Figure 10. HDAC9 possesses a putative catalytic domain which encompasses
approximately 317 aa (~6 to 323) based upon alignments of HDAC9 with the putative catalytic
domains of all of the other known HDACs. To identify the catalytic domain of HDAC9,
Clustalw alignments were performed separately using HDAC9 complete peptide and catalytic
domain sequences from class I HDACs (1-3 and 8) or class II HDACs (4-7). Thirteen amino acids
were previously shown to confer deacetylase activity, based upon inactivation by single amino
acid mutations and the three dimensional structure formed by a complex of HDAC-like protein
(HDLP), Zn2+ and HDAC inhibitors (Finnin, M. S., Donigian, J. R., Cohen, A., Richon, V. M.,
Rifkind, R. A., Marks, P. A., Breslow, R., and Pavletich, N. P. (1999) Structures of a histone
deacetylase homologue bound to TSA and SAHA inhibitors. Nature 401, 188-193). These 13
amino acids include Pro 22, His 131, His 132, Gly 140, Phe 141, Asp 166, Asp 168, His 170,
Asp 173, Phe 198, Asp 258, Leu 265, and Tyr 297. 12 out of 13 of these amino acids are
conserved in HDAC9. The amino acid that is not conserved is Leu 265. This hydrophobic
residue forms part of the TSA binding pocket and is replaced in HDAC9 with Glu at amino acid
272. Leu 265 is replaced with Met in HDAC8 and Lys in HDAC6 domain 1. This suggests that
this residue is not highly conserved and need not be identical to other HDACs. The second
residue that differs from HDLP, HDAC1, and HDAC2, Asp 173, is substituted with Gln at
position 177 in HDAC9, a difference that is also present in the HDAC6 catalytic domain 1.
Furthermore, Asp 173 is substituted with Asn in HDACs 4, 5, 6 (domain 2), and 7. This evidence
suggests that these Asp 173 substitutions do not affect HDAC activity.

An amino acid sequence motif was previously found to be important for the binding of
HDACs 1 and 2 to retinoblastoma protein (Rb). Complexes of HDACs 1 and 2 and Rb induce
repression of E2F responsive promoters (Brehm, A., Miska, E. A., McCance, D. J., Reid, J. L.,
Bannister, A. J., and Kouzarides, T. (1998) Nature 391, 597-601). An Rb-binding motif fits the
sequence model LXCXE, where "X" can be any amino acid. The LXCXE domain has been
found to be dispensable for growth suppression function of Rb, but is necessary for HDAC
binding (Chen, T.-T. and Wang, J. Y. J. (2000) Mol Cell Biol 20, 5571-5580). The Rb-binding
domain that was previously determined in HDAC1 is located from amino acid 414 to amino acid
419 and is the sequence IACEE. So far, it has not been determined whether other HDACs are
capable of binding to Rb. However, HDAC9 contains a putative Rb-binding motif, LSCIL, that
aligned with HDAC1 IACEE and is located between amino acids 560 and 564.
Co-immunoprecipitation of HDAC9 with Rb is one strategy that may be used to validate the
function of this motif in HDAC9.
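
The LXCXE sequence model described above lends itself to a simple pattern scan. The following is a minimal sketch, assuming a protein given as a one-letter amino acid string; the function name and the example peptide are hypothetical and are not part of the HDAC9 sequence. The default pattern is the strict model (Leu, any residue, Cys, any residue, Glu), and a looser regular expression can be passed in if degenerate variants of the motif are of interest.

```python
import re
from typing import List, Tuple


def find_motif(protein: str, pattern: str = "L.C.E") -> List[Tuple[int, str]]:
    """Return (1-based start position, matched residues) for every motif hit.

    A lookahead is used so that overlapping hits are also reported.
    """
    scanner = re.compile("(?=(" + pattern + "))")
    return [(m.start() + 1, m.group(1)) for m in scanner.finditer(protein.upper())]


# Hypothetical peptide containing two LXCXE-like stretches.
for position, residues in find_motif("MKLTCSEAALACAEK"):
    print(position, residues)
```

Reported positions are 1-based so that they can be compared directly with the amino acid numbering used in the text.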

As a member of the HDAC family, HDAC9 could form biologically relevant complexes
with proteins and display functions that have been described for other HDACs. For example, it is
likely to be involved in the regulation of transcription as a component of complexes that are
involved in transcriptional repression that is mediated through interactions of HDACs with
multi-protein complexes and which requires deacetylase activity. Thus, increased activity or
expression of HDAC9 may be associated with numerous pathological conditions, including but
not limited to, abnormal cell proliferation, cancer, atherosclerosis, inflammatory bowel disease,
host inflammatory or immune response, or psoriasis.

Thus, the DNA/amino acid sequence and predicted structure of HDAC9 will be useful for
designing agents (e.g. antagonists or inhibitors) useful to ameliorate conditions associated with
abnormal HDAC activity. These may include, for example, antiproliferative or
antiinflammatory agents either through the use of small molecules or proteins (e.g. antibodies)
directed against it or associated proteins in HDAC transcription repressor complexes. In
addition, protein derived from the HDAC9 sequence may also be used as a therapeutic to modify
host cell proliferative or inflammatory responses.

To determine the expression pattern of the novel polypeptide, a panel of mRNAs from a
variety of human tissues is subjected to Northern analysis. Data indicate that HDAC9 is
expressed in human tissues, being detectable in brain, colon, heart, kidney, liver, placenta, small
intestine, spleen, stomach and testes. Thus, HDAC9 represents a transcribed gene.

Therefore, in one aspect, the present invention relates to a novel histone deacetylase
(HDAC). As outlined above, HDAC9 is clearly a member of the HDAC family since it is highly
similar to other HDAC proteins in the hda1 class II HDACs. It also shares many similarities
with the HDAC family.

The present invention relates to an isolated polypeptide comprising the amino acid
sequence set forth in SEQ ID NO:1. For example, such a polypeptide may be a fusion protein
including the amino acid sequence of the novel HDAC9. In another aspect the present invention
relates to an isolated polypeptide consisting of the amino acid sequence set forth in SEQ ID
NO:1, which is, in particular, the novel HDAC9.

The invention includes nucleic acid or nucleotide molecules, preferably DNA molecules,
in particular encoding the novel HDAC9. Preferably, an isolated nucleic acid molecule,
preferably a DNA molecule, of the present invention encodes a polypeptide comprising the
amino acid sequence set forth in SEQ ID NO:1, SEQ ID NO:5 or SEQ ID NO:6. Likewise
preferred is an isolated nucleic acid molecule, preferably a DNA molecule, encoding a
polypeptide consisting of the amino acid sequence set forth in SEQ ID NO:1, SEQ ID NO:5 or
SEQ ID NO:6. Such a nucleic acid or nucleotide, in particular such a DNA molecule, preferably
comprises a nucleotide sequence selected from the group consisting of (1) the nucleotide
sequence as set forth in SEQ ID NO:2, 7 or 8, which is the complete cDNA sequence encoding
the polypeptide consisting of the amino acid sequence set forth in SEQ ID NO:1, 5 and 6,
respectively; (2) the nucleotide sequence set forth in SEQ ID NO:3, which corresponds to the
open reading frame of the cDNA sequence set forth in SEQ ID NO:2; (3) a nucleotide sequence
capable of hybridizing under high stringency conditions to a nucleotide sequence set forth in
SEQ ID NO:3; and (4) the nucleotide sequence set forth in SEQ ID NO:4, which corresponds to
the endogenous genomic human DNA encoding the polypeptide consisting of the amino acid
sequence set forth in SEQ ID NO:1. Such hybridization conditions may be highly stringent or
less highly stringent, as described above. In instances wherein the nucleic acid molecules are
deoxyoligonucleotides ("oligos"), highly stringent conditions may refer, e.g., to washing in 6X
SSC/0.05% sodium pyrophosphate at 37 °C (for 14-base oligos), 48 °C (for 17-base oligos), 55
°C (for 20-base oligos), and 60 °C (for 23-base oligos). Suitable ranges of such stringency
conditions for nucleic acids of varying compositions are described in Krause and Aaronson
(1991), Methods in Enzymology, 200:546-556 in addition to Maniatis et al., cited above.
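
The oligo wash temperatures quoted above can be held in a small lookup table. This is a sketch only: the four length/temperature pairs come from the paragraph, while the function name is hypothetical and the choice to reject other lengths (rather than interpolate) is an assumption, since the text gives no rule for intermediate lengths.

```python
# Wash temperatures (°C) in 6X SSC/0.05% sodium pyrophosphate, keyed by oligo
# length in bases, as listed in the paragraph above.
OLIGO_WASH_TEMP_C = {14: 37, 17: 48, 20: 55, 23: 60}


def wash_temperature_c(oligo_length: int) -> int:
    """Return the listed wash temperature for an oligo of the given length."""
    if oligo_length not in OLIGO_WASH_TEMP_C:
        listed = sorted(OLIGO_WASH_TEMP_C)
        raise ValueError(
            f"no temperature listed for {oligo_length}-base oligos; listed lengths: {listed}"
        )
    return OLIGO_WASH_TEMP_C[oligo_length]


print(wash_temperature_c(20))
```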

These nucleic acid molecules may act as target gene antisense molecules, useful, for
example, in target gene regulation and/or as antisense primers in amplification reactions of target
gene nucleic acid sequences. Further, such sequences may be used as part of ribozyme and/or
triple helix sequences, also useful for target gene regulation. Still further, such molecules may be
used as components of diagnostic methods whereby the presence of an allele causing a disease
associated with abnormal HDAC9 expression or activity, for example, abnormal cell
proliferation, cancer, atherosclerosis, inflammatory bowel disease, host inflammatory or immune
response, or psoriasis, may be detected.

The invention also encompasses (a) vectors that contain at least a fragment of any of the
foregoing nucleotide sequences and/or their complements (i.e., antisense); (b) vector molecules,
preferably vector molecules comprising transcriptional control sequences, in particular
expression vectors, that contain any of the foregoing coding sequences operatively associated
with a regulatory element that directs the expression of the coding sequences; and (c) genetically
engineered host cells that contain a vector molecule as mentioned herein or at least a fragment of
any of the foregoing nucleotide sequences operatively associated with a regulatory element that
directs the expression of the coding sequences in the host cell. As used herein, regulatory
elements include, but are not limited to, inducible and non-inducible promoters, enhancers,
operators and other elements known to those skilled in the art that drive and regulate expression.
Preferably, host cells can be vertebrate host cells, preferably mammalian host cells, such as
human cells or rodent cells, such as CHO or BHK cells. Likewise preferred, host cells can be
bacterial host cells, in particular E. coli cells.

Particularly preferred is a host cell, in particular of the above described type, which can
be propagated in vitro and which is capable upon growth in culture of producing an HDAC9
polypeptide, in particular a polypeptide comprising or consisting of an amino acid sequence set
forth in SEQ ID NO:1, wherein said cell contains some fragment or complete sequence of
HDAC9 coding sequence in a construct that is controlled by one or more transcriptional control
sequences that are not transcriptional control sequences of the natural endogenous human gene
encoding said polypeptide, wherein said one or more transcriptional control sequences control
transcription of a DNA encoding said polypeptide. Possible transcriptional control sequences
include, but are not limited to, bacterial or viral promoter sequences.

The invention includes the complete sequence of the gene as well as fragments of any of
the nucleic acid sequences disclosed herein. Fragments of the nucleic acid sequences encoding
the novel HDAC9 polypeptide may be used as a hybridization probe for a cDNA library to
isolate other genes which have a high sequence similarity to the HDAC9 gene or similar
biological activity. Probes of this type preferably have at least about 30 bases and may contain,
for example, from about 30 to about 50 bases, about 50 to about 100 bases, about 100 to about
200 bases, or more than 200 bases. The probe may also be used to identify a cDNA clone
corresponding to a full length transcript and a genomic clone or clones that contain the complete
HDAC9 gene including regulatory and promoter regions, exons, and introns. An example of a
screen comprises isolating the coding region of the HDAC9 gene by using the known DNA
sequence to synthesize an oligonucleotide probe. Labeled oligonucleotides having a sequence
complementary to that of the gene of the present invention may be used to screen a library of
human cDNA, genomic DNA or mRNA to determine the members of the library to which the
probe hybridizes.

In addition to the gene sequences described above, homologs of such sequences, as may,
for example, be present in other species, may be identified and may be readily isolated, without
undue experimentation, by molecular biological techniques well known in the art. Further, there
may exist genes at other genetic loci within the genome that encode proteins which have
homology to one or more domains of such gene products. These genes may also be identified via
similar techniques. For example, the isolated nucleotide sequence of the present invention
encoding the novel HDAC9 polypeptide may be labeled and used to screen a cDNA library
constructed from mRNA obtained from the organism of interest. Hybridization conditions will
be of a lower stringency when the cDNA library is derived from an organism different from the
type of organism from which the labeled sequence was derived. Alternatively, the labeled
fragment may be used to screen a genomic library derived from the organism of interest, again,
using appropriately stringent conditions. Such low stringency conditions will be well known to
those of skill in the art, and will vary predictably depending on the specific organisms from
which the library and the labeled sequences are derived. For guidance regarding such conditions
see, for example, Sambrook et al. cited above.

Further, a previously unknown differentially expressed gene-type sequence may be
isolated by performing PCR using two degenerate oligonucleotide primer pools designed on the
basis of amino acid sequences within the gene of interest. The template for the reaction may be
cDNA obtained by reverse transcription of mRNA prepared from human or non-human cell lines
or tissue known or suspected to express a differentially expressed gene allele. The PCR product
may be subcloned and sequenced to ensure that the amplified sequences represent the sequences
of a differentially expressed gene-like nucleic acid sequence. The PCR fragment may then be
used to isolate a complete cDNA clone by a variety of conventional methods. For example, the
amplified fragment may be labeled and used to screen a bacteriophage cDNA library.
Alternatively, the labeled fragment may be used to screen a genomic library.

PCR technology may also be utilized to isolate full length cDNA sequences. For
example, RNA may be isolated, following standard procedures, from an appropriate cellular or
tissue source. A reverse transcription reaction may be performed on the RNA using an
oligonucleotide primer specific for the most 5' end of the amplified fragment for the priming of
first strand synthesis. The resulting RNA/DNA hybrid may then be "tailed" with guanines using
a standard terminal transferase reaction, the hybrid may be digested with RNase H, and second
strand synthesis may then be primed with a poly-C primer. Thus, cDNA sequences upstream of
the amplified fragment may easily be isolated. For a review of cloning strategies which may be
used, see e.g., Sambrook et al., 1989, supra.

In cases where the gene identified is the normal, or wild type, gene, this gene may be
used to isolate mutant alleles of the gene. Such an isolation is preferable in processes and
disorders which are known or suspected to have a genetic basis. Mutant alleles may be isolated
from individuals either known or suspected to have a genotype which contributes to disease
symptoms related to abnormal HDAC activity, including, but not limited to, conditions such as
abnormal cell proliferation, cancer, atherosclerosis, inflammatory bowel disease, host
inflammatory or immune response, or psoriasis. Mutant alleles and mutant allele products may
then be utilized in the diagnostic assay systems described below.

A cDNA of the mutant gene may be isolated, for example, by using PCR, a technique
which is well known to those of skill in the art. In this case, the first cDNA strand may be
synthesized by hybridizing an oligo-dT oligonucleotide to mRNA isolated from tissue known or
suspected to be expressed in an individual putatively carrying the mutant allele, and by extending
the new strand with reverse transcriptase. The second strand of the cDNA is then synthesized
using an oligonucleotide that hybridizes specifically to the 5' end of the normal gene. Using these
two primers, the product is then amplified via PCR, cloned into a suitable vector, and subjected
to DNA sequence analysis through methods well known to those of skill in the art. By comparing
the DNA sequence of the mutant gene to that of the normal gene, the mutation(s) responsible for
the loss or alteration of function of the mutant gene product can be ascertained.

Alternatively, a genomic or cDNA library can be constructed and screened using DNA or
RNA, respectively, from a tissue known to or suspected of expressing the gene of interest in an
individual suspected of or known to carry the mutant allele. The normal gene or any suitable
fragment thereof may then be labeled and used as a probe to identify the corresponding mutant
allele in the library. The clone containing this gene may then be purified through methods
routinely practiced in the art, and subjected to sequence analysis as described above.

Additionally, an expression library can be constructed utilizing DNA isolated from or
cDNA synthesized from a tissue known to or suspected of expressing the gene of interest in an
individual suspected of or known to carry the mutant allele. In this manner, gene products made
by the putatively mutant tissue may be expressed and screened using standard antibody screening
techniques in conjunction with antibodies raised against the normal gene product, as described
below. (For screening techniques, see, for example, Harlow, E. and Lane, eds., 1988,
"Antibodies: A Laboratory Manual", Cold Spring Harbor Press, Cold Spring Harbor.) In cases
where the mutation results in an expressed gene product with altered function (e.g., as a result of
a missense mutation), a polyclonal set of antibodies is likely to cross-react with the mutant gene
product. Library clones detected via their reaction with such labeled antibodies can be purified
and subjected to sequence analysis as described above.

The present invention includes those proteins encoded by nucleotide sequences set forth
in any of SEQ ID NOs:2, 3, 4, 7 or 8, in particular, a polypeptide that is or includes the amino
acid sequence set out in SEQ ID NO:1, 5 or 6 or fragments thereof.

Furthermore, the present invention includes proteins that represent functionally
equivalent gene products. Such an equivalent differentially expressed gene product may contain
deletions, additions or substitutions of amino acid residues within the amino acid sequence
encoded by the differentially expressed gene sequences described above, but which result in a
silent change, thus producing a functionally equivalent differentially expressed gene product.
Amino acid substitutions may be made on the basis of similarity in polarity, charge, solubility,
hydrophobicity, hydrophilicity, and/or the amphipathic nature of the residues involved.

For example, nonpolar (hydrophobic) amino acids include alanine, leucine, isoleucine,
valine, proline, phenylalanine, tryptophan, and methionine; polar neutral amino acids include
glycine, serine, threonine, cysteine, tyrosine, asparagine, and glutamine; positively charged
(basic) amino acids include arginine, lysine, and histidine; and negatively charged (acidic) amino
acids include aspartic acid and glutamic acid. "Functionally equivalent," as utilized herein, may
refer to a protein or polypeptide capable of exhibiting a substantially similar in vivo or in vitro
activity as the endogenous differentially expressed gene products encoded by the differentially
expressed gene sequences described above. "Functionally equivalent" may also refer to proteins
or polypeptides capable of interacting with other cellular or extracellular molecules in a manner
substantially similar to the way in which the corresponding portion of the endogenous
differentially expressed gene product would. For example, a "functionally equivalent" peptide
would be able, in an immunoassay, to diminish the binding of an antibody to the corresponding
peptide (i.e., the peptide the amino acid sequence of which was modified to achieve the
"functionally equivalent" peptide) of the endogenous protein, or to the endogenous protein itself,
where the antibody was raised against the corresponding peptide of the endogenous protein. An
equimolar concentration of the functionally equivalent peptide will diminish the aforesaid
binding of the corresponding peptide by at least about 5%, preferably between about 5% and
10%, more preferably between about 10% and 25%, even more preferably between about 25%
and 50%, and most preferably between about 40% and 50%.
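
The four residue classes listed above can be encoded directly, which gives a quick way to ask whether a proposed substitution stays within one class. This is a minimal sketch using one-letter amino acid codes; the class names and function names are hypothetical, and staying within a class is only one of the similarity criteria the text mentions, so it does not by itself establish functional equivalence.

```python
# Residue classes as listed in the paragraph above, in one-letter codes.
RESIDUE_CLASSES = {
    "nonpolar": set("ALIVPFWM"),      # Ala, Leu, Ile, Val, Pro, Phe, Trp, Met
    "polar_neutral": set("GSTCYNQ"),  # Gly, Ser, Thr, Cys, Tyr, Asn, Gln
    "basic": set("RKH"),              # Arg, Lys, His
    "acidic": set("DE"),              # Asp, Glu
}


def residue_class(residue: str) -> str:
    """Return the class name for a one-letter amino acid code."""
    code = residue.upper()
    for name, members in RESIDUE_CLASSES.items():
        if code in members:
            return name
    raise ValueError(f"unknown amino acid code: {residue!r}")


def same_class(original: str, substitute: str) -> bool:
    """True if both residues fall in the same class from the list above."""
    return residue_class(original) == residue_class(substitute)


print(same_class("L", "I"), same_class("D", "K"))
```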

The polypeptides of the present invention may be produced by recombinant DNA 
technology using techniques well known in the art. Therefore, there is provided a method of 
producing a polypeptide of the present invention, which method comprises culturing a host cell 
having incorporated therein an expression vector containing an exogenously-derived
polynucleotide encoding a polypeptide comprising an amino acid sequence as set forth in SEQ
ID NOs:1, 5 or 6 under conditions sufficient for expression of the polypeptide in the host cell,
thereby causing the production of the expressed polypeptide. Optionally, said method further
comprises recovering the polypeptide produced by said cell. In a preferred embodiment of such a
method, said exogenously-derived polynucleotide encodes a polypeptide consisting of an amino
acid sequence set forth in SEQ ID NOs:1, 5 or 6. Preferably, said exogenously-derived
polynucleotide comprises the nucleotide sequence as set forth in any of SEQ ID NO:2, SEQ ID
NO:3, SEQ ID NO:4, SEQ ID NO:7 or SEQ ID NO:8. In case of using the nucleotide
sequence set forth in SEQ ID NO:3, i.e. the open reading frame, the sequence, when inserted into
a vector, may be followed by one or more appropriate translation stop codons, preferably by the
natural endogenous stop codon TGA beginning at nucleotide 2021 in the cDNA sequence.
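
Following an open reading frame with a translation stop codon, as described above, can be illustrated with a short Python sketch. It assumes the open reading frame is supplied as a plain DNA string; the function name and the toy sequence are hypothetical, and the toy sequence is not SEQ ID NO:3.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def with_stop_codon(orf: str, stop: str = "TGA") -> str:
    """Return the ORF followed by one in-frame stop codon, if it lacks one."""
    orf = orf.upper()
    if stop not in STOP_CODONS:
        raise ValueError(f"{stop} is not a stop codon")
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    if not orf.startswith("ATG"):
        raise ValueError("ORF must begin with the ATG initiation codon")
    if orf[-3:] in STOP_CODONS:
        return orf  # already terminated, nothing to add
    return orf + stop


# Toy sequence for illustration only.
print(with_stop_codon("ATGGCTAAA"))
```

The sketch checks only the final codon; it does not scan for premature internal stop codons, which a real construct design would also need to rule out.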

Thus, methods for preparing the polypeptides and peptides of the invention by expressing 
nucleic acid encoding respective nucleotide sequences are described herein. Methods which are 
well known to those skilled in the art can be used to construct expression vectors containing 
protein coding sequences and appropriate transcriptional/translational control signals. These
methods include, for example, in vitro recombinant DNA techniques, synthetic techniques and in
vivo recombination/genetic recombination. See, for example, the techniques described in
Sambrook et al., 1989, supra, and Ausubel et al., 1989, supra. Alternatively, RNA capable of
encoding differentially expressed gene protein sequences may be chemically synthesized using,
for example, synthesizers. See, for example, the techniques described in "Oligonucleotide
Synthesis", 1984, Gait, M. J. ed., IRL Press, Oxford, which is incorporated by reference herein in
its entirety. 

A variety of host-expression vector systems may be utilized to express the HDAC9 gene 
coding sequences of the invention. Such host-expression systems represent vehicles by which the 
coding sequences of interest may be produced and subsequently purified, but also represent cells
which may, when transformed or transfected with the appropriate nucleotide coding sequences,
exhibit the HDAC9 gene protein of the invention in situ. These include but are not limited to 
microorganisms such as bacteria (e.g., E. coli, B. subtilis) transformed with recombinant 
bacteriophage DNA, plasmid DNA or cosmid DNA expression vectors containing differentially 
expressed gene protein coding sequences; yeast (e.g. Saccharomyces, Pichia) transformed with
recombinant yeast expression vectors containing the differentially expressed gene protein coding
sequences; insect cell systems infected or transfected with recombinant virus expression vectors 
(e.g., baculovirus) containing the differentially expressed gene protein coding sequences; plant 
cell systems infected with recombinant virus expression vectors (e.g., cauliflower mosaic virus, 
CaMV; tobacco mosaic virus, TMV) or transformed with recombinant vectors, including
plasmids, (e.g., Ti plasmid) containing protein coding sequences; or mammalian cell systems
(e.g. COS, CHO, BHK, 293, 3T3) harboring recombinant expression constructs containing
promoters derived from the genome of mammalian cells (e.g., metallothionein promoter) or
from mammalian viruses (e.g., the adenovirus late promoter; the vaccinia virus 7.5K promoter, 
or the CMV promoter). 

Expression of the HDAC9 of the present invention by a cell from an HDAC9 encoding
gene that is native to the cell can also be performed. Methods for such expression are detailed in,
e.g., U.S. Patents 5,641,670; 5,733,761; 5,968,502; and 5,994,127, all of which are expressly
incorporated by reference herein in their entirety. Cells that have been induced to express
HDAC9 by the methods of any of U.S. Patents 5,641,670; 5,733,761; 5,968,502; and 5,994,127
can be implanted into a desired tissue in a living animal in order to increase the local
concentration of HDAC9 in the tissue. 

In bacterial systems, a number of expression vectors may be advantageously selected 
depending upon the use intended for the protein being expressed. For example, when a large
quantity of such a protein is to be produced, for the generation of antibodies or to screen peptide
libraries, for example, vectors which direct the expression of high levels of fusion protein
products that are readily purified may be desirable. In this respect, fusion proteins comprising
hexahistidine tags may be used, such as EpiTag vectors including pCDNA3.1/His (Invitrogen,
Carlsbad, CA). Other vectors include, but are not limited to, the E. coli expression vector
pUR278 (Ruther et al., 1983, EMBO J. 2:1791), in which the protein coding sequence may be
ligated individually into the vector in frame with the lac Z coding region so that a fusion protein
is produced; pIN vectors (Inouye & Inouye, 1985, Nucleic Acids Res. 13:3101-3109; Van Heeke
& Schuster, 1989, J. Biol. Chem. 264:5503-5509); and the like. pGEX vectors may also be used 
to express foreign polypeptides as fusion proteins with glutathione S-transferase (GST). In
general, such fusion proteins are soluble and can easily be purified from lysed cells by adsorption
to glutathione-agarose beads followed by elution in the presence of free glutathione. The pGEX
vectors are designed to include thrombin or factor Xa protease cleavage sites so that the cloned
target gene protein can be released from the GST moiety. Fusion proteins containing Flag tags,
such as 3X Flag (Sigma, St. Louis, MO) or myc tags, for example pCDNA3.1/myc-His
(Invitrogen, Carlsbad, CA) may be used. These fusions allow coimmunoprecipitation and
Western detection of proteins for which antibodies are not yet available. 

Promoter regions can be selected from any desired gene using vectors that contain a 
reporter transcription unit lacking a promoter region, such as a chloramphenicol acetyl 
transferase ("CAT"), or the luciferase transcription unit, downstream of a restriction site or sites
for introducing a candidate promoter fragment; i.e., a fragment that may contain a promoter. For
example, introduction into the vector of a promoter-containing fragment at the restriction site 
upstream of the cat gene engenders production of CAT activity, which can be detected by 
standard CAT assays. Vectors suitable to this end are well known and readily available. Two 
such vectors are pKK232-8 and pCM7. Thus, promoters for expression of polynucleotides of the
present invention include not only well known and readily available promoters, but also
promoters that readily may be obtained by the foregoing technique, using a reporter gene. 

Among known bacterial promoters suitable for expression of polynucleotides and 
polypeptides in accordance with the present invention are the E. coli lacI and lacZ promoters, the
T3 and T7 promoters, the T5 tac promoter, the lambda PR, PL promoters and the trp promoter.
Among known eukaryotic promoters suitable in this regard are the CMV immediate early
promoter, the HSV thymidine kinase promoter, the early and late SV40 promoters, the promoters
of retroviral LTRs, such as those of the Rous sarcoma virus ("RSV"), and metallothionein
promoters, such as the mouse metallothionein-I promoter. For example, a plasmid construct
could contain an HDAC9 transcriptional control sequence fused to a reporter transcription unit
that encodes the coding region of β-galactosidase, chloramphenicol acetyltransferase, green
fluorescent protein or luciferase. This construct could be used to screen for small molecules that
modulate HDAC9 transcription. Such molecules are potential therapeutics. Furthermore, an 
HDAC9 reporter gene could be used to examine the effects of an HDAC9 therapeutic in
mammalian cells or xenografts using fluorescent reporters and imaging techniques, such as
fluorescence microscopy or Biophotonic in vivo imaging, a technology that produces visual and
quantitative measurements in real time (Xenogen, Palo Alto, CA). Changes in these reporters in 
normal, diseased or drug-treated tissue or cells would be indicators of changes in HDAC9 
expression or activity. 
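
A reporter-based screen of the kind described above is typically read out as a change in reporter signal relative to an untreated control. The following minimal Python sketch assumes replicate reporter readings for compound-treated and vehicle-only wells. The function names, the two-fold cutoff and the example counts are hypothetical choices made for illustration; they are not specified in this disclosure.

```python
from statistics import mean


def fold_change(treated: list[float], vehicle: list[float]) -> float:
    """Mean reporter signal of treated wells relative to vehicle-only wells."""
    baseline = mean(vehicle)
    if baseline <= 0:
        raise ValueError("vehicle signal must be positive")
    return mean(treated) / baseline


def classify(fc: float, cutoff: float = 2.0) -> str:
    """Label a compound using an assumed, arbitrary fold-change cutoff."""
    if fc >= cutoff:
        return "activator"
    if fc <= 1.0 / cutoff:
        return "repressor"
    return "no call"


# Hypothetical luminescence counts, for illustration only.
fc = fold_change([400.0, 420.0, 380.0], [1000.0, 1100.0, 900.0])
print(round(fc, 2), classify(fc))
```

In practice the cutoff would be set from the assay's own variability, and a second constitutive reporter is often used to normalize for cell number.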

In an insect system, Autographa californica nuclear polyhedrosis virus (AcNPV) is one 
of several insect systems that can be used as a vector to express foreign genes. The virus grows 
in Spodoptera frugiperda cells. The coding sequence may be cloned individually into non-essential
regions (for example the polyhedrin gene) of the virus and placed under control of an
AcNPV promoter (for example the polyhedrin promoter). Successful insertion of the coding
sequence will result in inactivation of the polyhedrin gene and production of non-occluded
recombinant virus (i.e., virus lacking the proteinaceous coat coded for by the polyhedrin gene).
These recombinant viruses are then used to infect Spodoptera frugiperda cells in which the
inserted gene is expressed (e.g., see Smith et al., 1983, J. Virol. 46:584; Smith, U.S. Pat. No.
4,215,051).

In mammalian host cells, a number of viral-based expression systems may be utilized. In
cases where an adenovirus is used as an expression vector, the coding sequence of interest may 
be ligated to an adenovirus transcription/translation control complex, e.g., the late promoter and 
tripartite leader sequence. This chimeric gene may then be inserted in the adenovirus genome by
in vitro or in vivo recombination. Insertion in a non-essential region of the viral genome (e.g.,
region E1 or E3) will result in a recombinant virus that is viable and capable of expressing the
desired protein in infected hosts (e.g., see Logan & Shenk, 1984, Proc. Natl. Acad. Sci. USA
81:3655-3659). Specific initiation signals may also be required for efficient translation of
inserted gene coding sequences. These signals include the ATG initiation codon and adjacent
sequences. In cases where an entire gene, including its own initiation codon and adjacent
sequences, is inserted into the appropriate expression vector, no additional translational control
signals may be needed. However, in cases where only a portion of the gene coding sequence is
inserted, exogenous translational control signals, including, perhaps, the ATG initiation codon,
must be provided. Furthermore, the initiation codon must be in phase with the reading frame of
the desired coding sequence to ensure translation of the entire insert. These exogenous
translational control signals and initiation codons can be of a variety of origins, both natural and
synthetic. The efficiency of expression may be enhanced by the inclusion of appropriate
transcription enhancer elements, transcription terminators, etc. (see Bittner et al., 1987, Methods
in Enzymol. 153:516-544). Other common systems are based on SV40, retrovirus or adeno-associated
virus. Selection of appropriate vectors and promoters for expression in a host cell is a
well known procedure and the requisite techniques for expression vector construction,
introduction of the vector into the host and expression in the host per se are routine skills in the
art. Generally, recombinant expression vectors will include origins of replication, a promoter
derived from a highly-expressed gene to direct transcription of a downstream structural
sequence, and a selectable marker to permit isolation of vector containing cells after exposure to
the vector. 
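
The requirement stated above, that an exogenous initiation codon be in phase with the reading frame of the inserted coding sequence, amounts to a distance check in multiples of three. The following minimal Python sketch assumes the construct is available as a DNA string with known 0-based positions for the ATG and for the first complete codon of the insert. The function name, the positions and the toy construct are hypothetical.

```python
def atg_in_frame(construct: str, atg_pos: int, insert_start: int) -> bool:
    """Check that an exogenous ATG is in phase with a downstream insert.

    construct: full DNA sequence of the hypothetical construct.
    atg_pos: 0-based index of the A of the initiation codon.
    insert_start: 0-based index of the first base of the first complete
        codon of the inserted coding fragment.
    """
    construct = construct.upper()
    if construct[atg_pos:atg_pos + 3] != "ATG":
        raise ValueError("no ATG at the given position")
    if insert_start < atg_pos + 3:
        raise ValueError("insert must begin downstream of the ATG")
    # In phase means a whole number of codons separates ATG and insert.
    return (insert_start - atg_pos) % 3 == 0


# Toy construct: 6-base leader, ATG, one linker codon, then the insert.
toy = "GCCACCATGGGATCCAAAGCTTGA"
print(atg_in_frame(toy, 6, 12), atg_in_frame(toy, 6, 13))
```

A single-base shift in the linker, as in the second call, puts the insert out of phase and would be expected to yield a mistranslated or truncated product.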

In addition, a host cell strain may be chosen which modulates the expression of the
inserted sequences, or modifies and processes the gene product in the specific fashion desired.
Such modifications (e.g., glycosylation) and processing (e.g., cleavage) of protein products may
be important for the function of the protein. Different host cells have characteristic and specific
mechanisms for the post-translational processing and modification of proteins. Appropriate cell
lines or host systems can be chosen to ensure the correct modification and processing of the
foreign protein expressed. To this end, eukaryotic host cells which possess the cellular
machinery for proper processing of the primary transcript, glycosylation, and phosphorylation of
the gene product may be used. Such mammalian host cells include but are not limited to CHO,
VERO, BHK, HeLa, COS, MDCK, 293, 3T3, WI38, etc. and are well known to one of skill in 
the art. 

For long-term, high-yield production of recombinant proteins, stable expression is 
preferred. For example, cell lines that stably express a differentially expressed protein product of
a gene may be engineered. Rather than using expression vectors which contain viral origins of
replication, host cells can be transformed with DNA controlled by appropriate expression control
elements (e.g., promoter, enhancer sequences, transcription terminators, polyadenylation sites,
etc.), and a selectable marker. Following the introduction of the foreign DNA, engineered cells
may be allowed to grow for 1-2 days in an enriched media, and then are switched to a selective
media. The selectable marker in the recombinant plasmid confers resistance to the selection and
allows cells to stably integrate the plasmid into their chromosomes and grow to form foci which
in turn can be cloned and expanded into cell lines. This method may advantageously be used to
engineer cell lines that express the differentially expressed gene protein. Such engineered cell
lines may be particularly useful in screening and evaluation of compounds that affect the
endogenous activity of the expressed protein.

A number of selection systems may be used, including but not limited to, the herpes
simplex virus thymidine kinase (Wigler, et al., 1977, Cell 11:223), hypoxanthine-guanine
phosphoribosyltransferase (Szybalska & Szybalski, 1962, Proc. Natl. Acad. Sci. USA 48:2026),
and adenine phosphoribosyltransferase (Lowy, et al., 1980, Cell 22:817) genes, which can be employed
in tk-, hgprt- or aprt- cells, respectively. Also, antimetabolite resistance can be used as the basis of
selection for dhfr, which confers resistance to methotrexate (Wigler, et al., 1980, Proc. Natl. Acad. Sci.
USA 77:3567; O'Hare, et al., 1981, Proc. Natl. Acad. Sci. USA 78:1527); gpt, which confers
resistance to mycophenolic acid (Mulligan & Berg, 1981, Proc. Natl. Acad. Sci. USA 78:5072);
neo, which confers resistance to the aminoglycoside G-418 (Colberre-Garapin, et al., 1981, J.
Mol. Biol. 150:1); and hygro, which confers resistance to hygromycin (Santerre, et al., 1984,
Gene 30:147) genes.

An alternative fusion protein system allows for the ready purification of non-denatured 
fusion proteins expressed in human cell lines (Janknecht, et al., 1991, Proc. Natl. Acad. Sci. USA
88:8972-8976). In this system, the gene of interest is subcloned into a vaccinia recombination
plasmid such that the gene's open reading frame is translationally fused to an amino-terminal tag
consisting of six histidine residues. Extracts from cells infected with recombinant vaccinia virus
are loaded onto Ni2+ nitriloacetic acid-agarose columns and histidine-tagged proteins are
selectively eluted with imidazole-containing buffers. 

When used as a component in assay systems such as those described below, a protein of
the present invention may be labeled, either directly or indirectly, to facilitate detection of a
complex formed between the protein and a test substance. Any of a variety of suitable labeling
systems may be used including, but not limited to, radioisotopes such as 125I; enzyme labeling
systems that generate a detectable colorimetric signal or light when exposed to substrate; and
fluorescent labels.

Where recombinant DNA technology is used to produce a protein of the present 
invention for such assay systems, it may be advantageous to engineer fusion proteins that can 
facilitate labeling, immobilization, detection and/or isolation.

Indirect labeling involves the use of a protein, such as a labeled antibody, which
specifically binds to a polypeptide of the present invention. Such antibodies include but are not
limited to polyclonal, monoclonal, chimeric, single chain, Fab fragments and fragments 
produced by an Fab expression library. 

In another embodiment, nucleic acids comprising a sequence encoding HDAC9 protein
or functional derivative thereof, may be administered to promote normal biological function, for
example, normal transcriptional regulation, by way of gene therapy. Gene therapy refers to
therapy performed by the administration of a nucleic acid to a subject. In this embodiment of the
invention, the nucleic acid produces its encoded protein that mediates a therapeutic effect by
promoting normal transcriptional regulation.

Any of the methods for gene therapy available in the art can be used according to the
present invention. Exemplary methods are described below. 

In a preferred aspect, the therapeutic comprises an HDAC9 nucleic acid that is part of an
expression vector that expresses an HDAC9 protein or fragment or chimeric protein thereof in a
suitable host. In particular, such a nucleic acid has a promoter operably linked to the HDAC9
coding region, said promoter being inducible or constitutive, and, optionally, tissue-specific. In
another particular embodiment, a nucleic acid molecule is used in which the HDAC9 coding
sequences and any other desired sequences are flanked by regions that promote homologous
recombination at a desired site in the genome, thus providing for intrachromosomal expression of
the HDAC9 nucleic acid (Koller and Smithies, 1989, Proc. Natl. Acad. Sci. USA 86:8932-8935;
Zijlstra et al., 1989, Nature 342:435-438). 

Delivery of the nucleic acid into a patient may be either direct, in which case the patient 
is directly exposed to the nucleic acid or nucleic acid-carrying vector, or indirect, in which case, 
cells are first transformed with the nucleic acid in vitro, then transplanted into the patient. These
two approaches are known, respectively, as in vivo or ex vivo gene therapy.

In a specific embodiment, the nucleic acid is directly administered in vivo, where it is
expressed to produce the encoded product. This can be accomplished by any of numerous
methods known in the art, e.g., by constructing it as part of an appropriate nucleic acid
expression vector and administering it so that it becomes intracellular, e.g., by infection using a
defective or attenuated retroviral or other viral vector (see, e.g., U.S. Pat. No. 4,980,286 and
others mentioned infra), or by direct injection of naked DNA, or by use of microparticle
bombardment (e.g., a gene gun; Biolistic, Dupont), or coating with lipids or cell-surface
receptors or transfecting agents, encapsulation in liposomes, microparticles, or microcapsules, or
by administering it in linkage to a peptide which is known to enter the nucleus, by administering
it in linkage to a ligand subject to receptor-mediated endocytosis (see e.g., U.S. Patents
5,166,320; 5,728,399; 5,874,297; and 6,030,954, all of which are incorporated by reference
herein in their entirety) (which can be used to target cell types specifically expressing the
receptors), etc. In another embodiment, a nucleic acid-ligand complex can be formed in which
the ligand comprises a fusogenic viral peptide to disrupt endosomes, allowing the nucleic acid to
avoid lysosomal degradation. In yet another embodiment, the nucleic acid can be targeted in vivo
for cell specific uptake and expression, by targeting a specific receptor (see, e.g., PCT
Publications WO 92/06180; WO 92/22635; WO 92/20316; WO 93/14188; and WO 93/20221).
Alternatively, the nucleic acid can be introduced intracellularly and incorporated within host cell
DNA for expression, by homologous recombination (see, e.g., U.S. Patents 5,413,923;
5,416,260; and 5,574,205; and Zijlstra et al., 1989, Nature 342:435-438).

In a specific embodiment, a viral vector that contains the HDAC9 nucleic acid is used. 
For example, a retroviral vector can be used (see, e.g., U.S. Patents 5,219,740; 5,604,090; and 
5,834,182). These retroviral vectors have been modified to delete retroviral sequences that are 
not necessary for packaging of the viral genome and integration into host cell DNA. The HDAC9
nucleic acid to be used in gene therapy is cloned into the vector, which facilitates delivery of the
gene into a patient.

Adenoviruses are other viral vectors that can be used in gene therapy. Adenoviruses are 
especially attractive vehicles for delivering genes to respiratory epithelia. Adenoviruses naturally 
infect respiratory epithelia where they cause a mild disease. Other targets for adenovirus-based
delivery systems are liver, the central nervous system, endothelial cells, and muscle.
Adenoviruses have the advantage of being capable of infecting non-dividing cells. Methods for
conducting adenovirus-based gene therapy are described in, e.g., U.S. Patents 5,824,544;
5,868,040; 5,871,722; 5,880,102; 5,882,877; 5,885,808; 5,932,210; 5,981,225; 5,994,106;
5,994,132; 5,994,134; 6,001,557; and 6,033,8843, all of which are incorporated by reference
herein in their entirety.

Adeno-associated virus (AAV) has also been proposed for use in gene therapy. Methods 
for producing and utilizing AAV are described, e.g., in U.S. Patents 5,173,414; 5,252,479; 
5,552,311; 5,658,785; 5,763,416; 5,773,289; 5,843,742; 5,869,040; 5,942,496; and 5,948,675, all 
of which are incorporated by reference herein in their entirety. 

Another approach to gene therapy involves transferring a gene to cells in tissue culture by
such methods as electroporation, lipofection, calcium phosphate mediated transfection, or viral
infection. Usually, the method of transfer includes the transfer of a selectable marker to the cells.
The cells are then placed under selection to isolate those cells that have taken up and are
expressing the transferred gene. Those cells are then delivered to a patient.

In this embodiment, the nucleic acid is introduced into a cell prior to administration in
vivo of the resulting recombinant cell. Such introduction can be carried out by any method 
known in the art, including but not limited to transfection, electroporation, microinjection, 
infection with a viral or bacteriophage vector containing the nucleic acid sequences, cell fusion,
chromosome-mediated gene transfer, microcell-mediated gene transfer, spheroplast fusion, etc.
Numerous techniques are known in the art for the introduction of foreign genes into cells and
may be used in accordance with the present invention, provided that the necessary developmental
and physiological functions of the recipient cells are not disrupted. The technique should provide
for the stable transfer of the nucleic acid to the cell, so that the nucleic acid is expressible by the
cell and preferably heritable and expressible by its cell progeny.

The resulting recombinant cells can be delivered to a patient by various methods known 
in the art. In a preferred embodiment, epithelial cells are injected, e.g., subcutaneously. In 
another embodiment, recombinant skin cells may be applied as a skin graft onto the patient.
Recombinant blood cells (e.g., hematopoietic stem or progenitor cells) are preferably
administered intravenously. The amount of cells envisioned for use depends on the desired
effect, patient state, etc., and can be determined by one skilled in the art. 

Cells into which a nucleic acid can be introduced for purposes of gene therapy encompass 
any desired, available cell type, and include but are not limited to epithelial cells, endothelial 
cells, keratinocytes, fibroblasts, muscle cells, hepatocytes; blood cells such as T lymphocytes, B
lymphocytes, monocytes, macrophages, neutrophils, eosinophils, megakaryocytes, granulocytes;
various stem or progenitor cells, in particular hematopoietic stem or progenitor cells, e.g., as 
obtained from bone marrow, umbilical cord blood, peripheral blood, fetal liver, etc. 

In a preferred embodiment, the cell used for gene therapy is autologous to the patient.

In an embodiment in which recombinant cells are used in gene therapy, an HDAC9 nucleic
acid is introduced into the cells such that it is expressible by the cells or their progeny, and the
recombinant cells are then administered in vivo for therapeutic effect. In a specific embodiment,
stem or progenitor cells are used. Any stem and/or progenitor cells that can be isolated and
maintained in vitro can potentially be used in accordance with this embodiment of the present
invention. Such stem cells include but are not limited to hematopoietic stem cells (HSC), stem
cells of epithelial tissues such as the skin and the lining of the gut, embryonic heart muscle cells,
liver stem cells (see, e.g., WO 94/08598), and neural stem cells (Stemple and Anderson, 1992,
Cell 71:973-985). 

Epithelial stem cells (ESCs) or keratinocytes can be obtained from tissues such as the 
skin and the lining of the gut by known procedures (Rheinwald, 1980, Meth. Cell Bio. 21A:229).
In stratified epithelial tissue such as the skin, renewal occurs by mitosis of stem cells within the
germinal layer, the layer closest to the basal lamina. Stem cells within the lining of the gut
provide for a rapid renewal rate of this tissue. ESCs or keratinocytes obtained from the skin or
lining of the gut of a patient or donor can be grown in tissue culture (Pittelkow and Scott, 1986,
Mayo Clinic Proc. 61:771). If the ESCs are provided by a donor, a method for suppression of
host versus graft reactivity (e.g., irradiation, drug or antibody administration to promote
moderate immunosuppression) can also be used. 

With respect to hematopoietic stem cells (HSC), any technique which provides for the 
isolation, propagation, and maintenance in vitro of HSC can be used in this embodiment of the 
invention. Techniques by which this may be accomplished include (a) the isolation and
establishment of HSC cultures from bone marrow cells isolated from the future host, or a donor,
or (b) the use of previously established long-term HSC cultures, which may be allogeneic or
xenogeneic. Non-autologous HSC are used preferably in conjunction with a method of
suppressing transplantation immune reactions of the future host/patient. In a particular
embodiment of the present invention, human bone marrow cells can be obtained from the
posterior iliac crest by needle aspiration (see, e.g., Kodo et al., 1984, J. Clin. Invest. 73:1377-1384).
In a preferred embodiment of the present invention, the HSCs can be made highly
enriched or in substantially pure form. This enrichment can be accomplished before, during, or
after long-term culturing, and can be done by any techniques known in the art. Long-term
cultures of bone marrow cells can be established and maintained by using, for example, modified
Dexter cell culture techniques (Dexter et al., 1977, J. Cell Physiol. 91:335) or Whitlock-Witte
culture techniques (Whitlock and Witte, 1982, Proc. Natl. Acad. Sci. USA 79:3608-3612).

In a specific embodiment, the nucleic acid to be introduced for purposes of gene therapy
comprises an inducible promoter operably linked to the coding region, such that expression of
the nucleic acid is controllable by controlling the presence or absence of the appropriate inducer
of transcription.

A further embodiment of the present invention relates to a purified antibody or a
fragment thereof which specifically binds to a polypeptide that comprises the amino acid
sequence set forth in SEQ ID NOs:1, 5 or 6 or to a fragment of said polypeptide. A preferred
embodiment relates to a fragment of such an antibody, which fragment is an Fab or F(ab')2
fragment. In particular, the antibody can be a polyclonal antibody or a monoclonal antibody.

Described herein are methods for the production of antibodies capable of specifically 
recognizing one or more differentially expressed gene epitopes. Such antibodies may include, but 
are not limited to polyclonal antibodies, monoclonal antibodies (mAbs), humanized or chimeric
antibodies, single chain antibodies, Fab fragments, F(ab')2 fragments, fragments produced by a
Fab expression library, anti-idiotypic (anti-Id) antibodies, and epitope-binding fragments of any 
of the above. Such antibodies may be used, for example, in the detection of a fingerprint, target, 
gene in a biological sample, or, alternatively, as a method for the inhibition of abnormal target 
gene activity. Thus, such antibodies may be utilized as part of disease treatment methods, and/or
may be used as part of diagnostic techniques whereby patients may be tested for abnormal levels
of the HDAC9 polypeptide, or for the presence of abnormal forms of the HDAC9 polypeptide. 

For the production of antibodies to the HDAC9 polypeptide, various host animals may be 
immunized by injection with the HDAC9 polypeptide, or a portion thereof. Such host animals 
may include but are not limited to rabbits, mice, and rats, to name but a few. Various adjuvants
may be used to increase the immunological response, depending on the host species, including
but not limited to Freund's (complete and incomplete), mineral gels such as aluminum hydroxide,
surface active substances such as lysolecithin, pluronic polyols, polyanions, peptides, oil
emulsions, keyhole limpet hemocyanin, dinitrophenol, and potentially useful human adjuvants 
such as BCG (bacille Calmette-Guerin) and Corynebacterium parvum. 

Polyclonal antibodies are heterogeneous populations of antibody molecules derived from
the sera of animals immunized with an antigen, such as target gene product, or an antigenic
functional derivative thereof. For the production of polyclonal antibodies, host animals such as
those described above, may be immunized by injection with the HDAC9 polypeptide, or a
portion thereof, supplemented with adjuvants as also described above.

Monoclonal antibodies, which are homogeneous populations of antibodies to a particular
antigen, may be obtained by any technique which provides for the production of antibody 
molecules by continuous cell lines in culture. These include, but are not limited to the hybridoma 
technique of Kohler and Milstein, (1975, Nature 256:495-497; and U.S. Pat. No. 4,376,110), the
human B-cell hybridoma technique (Kosbor et al., 1983, Immunology Today 4:72; Cole et al.,
1983, Proc. Natl. Acad. Sci. USA 80:2026-2030), and the EBV-hybridoma technique (Cole et
al., 1985, Monoclonal Antibodies And Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). Such
antibodies may be of any immunoglobulin class including IgG, IgM, IgE, IgA, IgD and any
subclass thereof. The hybridoma producing the mAb of this invention may be cultivated in vitro
or in vivo. Production of high titers of mAbs in vivo makes this the presently preferred method
of production. 

In addition, techniques developed for the production of "chimeric antibodies" (Morrison 
et al., 1984, Proc. Natl. Acad. Sci., 81:6851-6855; Neuberger et al., 1984, Nature, 312:604-608;
Takeda et al., 1985, Nature, 314:452-454) by splicing the genes from a mouse antibody molecule
of appropriate antigen specificity together with genes from a human antibody molecule of
appropriate biological activity can be used. A chimeric antibody is a molecule in which different
portions are derived from different animal species, such as those having a variable or
hypervariable region derived from a murine mAb and a human immunoglobulin constant region.

Alternatively, techniques described for the production of single chain antibodies (U.S.
Pat. No. 4,946,778; Bird, 1988, Science 242:423-426; Huston et al., 1988, Proc. Natl. Acad. Sci.
USA 85:5879-5883; and Ward et al., 1989, Nature 334:544-546) can be adapted to produce 
differentially expressed gene-single chain antibodies. Single chain antibodies are formed by 
linking the heavy and light chain fragments of the Fv region via an amino acid bridge, resulting 
in a single chain polypeptide. 

Most preferably, techniques useful for the production of "humanized antibodies" can be
adapted to produce antibodies to the polypeptides, fragments, derivatives, and functional
equivalents disclosed herein. Such techniques are disclosed in U.S. Patent Nos. 5,932,448;
5,693,762; 5,693,761; 5,585,089; 5,530,101; 5,910,771; 5,569,825; 5,625,126; 5,633,425;
5,789,650; 5,545,580; 5,661,016; and 5,770,429, the disclosures of all of which are incorporated
by reference herein in their entirety.

Antibody fragments that recognize specific epitopes may be generated by known
techniques. For example, such fragments include but are not limited to: the F(ab')2 fragments
which can be produced by pepsin digestion of the antibody molecule and the Fab fragments
which can be generated by reducing the disulfide bridges of the F(ab')2 fragments. Alternatively,
Fab expression libraries may be constructed (Huse et al., 1989, Science, 246:1275-1281) to allow
rapid and easy identification of monoclonal Fab fragments with the desired specificity.

An antibody of the present invention can be preferably used in a method for the diagnosis
of a condition associated with abnormal HDAC9 expression or activity, for example, abnormal
cell proliferation, cancer, atherosclerosis, inflammatory bowel disease, host inflammatory or
immune response, or psoriasis, in a human which comprises: measuring the amount of a
polypeptide comprising the amino acid sequence set forth in SEQ ID NOs:1, 5 or 6, or fragments
thereof, in an appropriate tissue or cell from a human suffering from a condition associated with
abnormal HDAC9 activity, wherein the presence of an elevated amount of said polypeptide or
fragments thereof, relative to the amount of said polypeptide or fragments thereof in the
respective tissue from a human not suffering from a condition associated with abnormal HDAC9
activity is diagnostic of said human's suffering from such condition. Such a method forms a
further embodiment of the present invention. Preferably, said detecting step comprises contacting
said appropriate tissue or cell with an antibody which specifically binds to a polypeptide that
comprises the amino acid sequence set forth in SEQ ID NOs:1, 5 or 6 or a fragment thereof and
detecting specific binding of said antibody with a polypeptide in said appropriate tissue or cell,
wherein detection of specific binding to a polypeptide indicates the presence of a polypeptide
that comprises the amino acid sequence set forth in SEQ ID NOs:1, 5 or 6 or a fragment thereof.

Particularly preferred, for ease of detection, is the sandwich assay, of which a number of 
variations exist, all of which are intended to be encompassed by the present invention. 

For example, in a typical forward assay, unlabeled antibody is immobilized on a solid
substrate and the sample to be tested is brought into contact with the bound molecule. After a
suitable period of incubation, for a period of time sufficient to allow formation of an
antibody-antigen binary complex, a second antibody, labeled with a reporter molecule
capable of inducing a detectable signal, is added and incubated, allowing time sufficient for
the formation of a ternary complex of antibody-antigen-labeled antibody. Any unreacted material
is washed away, and the presence of the antigen is determined by observation of a signal, or may
be quantitated by comparing with a control sample containing known amounts of antigen.
Variations on the forward assay include the simultaneous assay, in which both sample and
antibody are added simultaneously to the bound antibody, or a reverse assay in which the labeled
antibody and sample to be tested are first combined, incubated and added to the unlabeled
surface bound antibody. These techniques are well known to those skilled in the art, and the
possibility of minor variations will be readily apparent. As used herein, "sandwich assay" is
intended to encompass all variations on the basic two-site technique. For the immunoassays of
the present invention, the only limiting factor is that the labeled antibody be an antibody which is
specific for the HDAC9 polypeptide or a fragment thereof.

The most commonly used reporter molecules in this type of assay are either enzymes, or
fluorophore- or radionuclide-containing molecules. In the case of an enzyme immunoassay an
enzyme is conjugated to the second antibody, usually by means of glutaraldehyde or periodate.
As will be readily recognized, however, a wide variety of different ligation techniques exist,
which are well-known to the skilled artisan. Commonly used enzymes include horseradish
peroxidase, glucose oxidase, beta-galactosidase and alkaline phosphatase, among others. The
substrates to be used with the specific enzymes are generally chosen for the production, upon
hydrolysis by the corresponding enzyme, of a detectable color change. For example,
p-nitrophenyl phosphate is suitable for use with alkaline phosphatase conjugates; for peroxidase
conjugates, 1,2-phenylenediamine or toluidine are commonly used. It is also possible to employ
fluorogenic substrates, which yield a fluorescent product, rather than the chromogenic substrates
noted above. A solution containing the appropriate substrate is then added to the ternary
complex. The substrate reacts with the enzyme linked to the second antibody, giving a qualitative
visual signal, which may be further quantitated, usually spectrophotometrically, to give an
evaluation of the amount of HDAC9 which is present in the serum sample.
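
The quantitation step described above, in which the signal from a test sample is compared with control samples containing known amounts of antigen, can be illustrated by a simple standard-curve calculation. The following Python sketch is illustrative only: the function name, the example readings, and the use of piecewise-linear interpolation between standards are assumptions made for this illustration and are not taken from the present specification; other curve-fitting approaches may be preferred in practice.

```python
from bisect import bisect_left


def interpolate_concentration(standards, absorbance):
    """Estimate antigen amount from a standard curve by linear interpolation.

    standards: list of (known_amount, absorbance) pairs (hypothetical values).
    The absorbance is assumed to rise monotonically with antigen amount.
    """
    # Order the standards by measured signal so they can be searched.
    pts = sorted(standards, key=lambda p: p[1])
    signals = [a for _, a in pts]
    if not signals[0] <= absorbance <= signals[-1]:
        raise ValueError("reading lies outside the range of the standards")
    i = bisect_left(signals, absorbance)
    if signals[i] == absorbance:
        return pts[i][0]
    # Interpolate between the two standards that bracket the reading.
    (c0, a0), (c1, a1) = pts[i - 1], pts[i]
    return c0 + (c1 - c0) * (absorbance - a0) / (a1 - a0)


# Hypothetical standards: (amount in ng/ml, absorbance).
curve = [(0, 0.05), (10, 0.25), (20, 0.45), (40, 0.85)]
estimate = interpolate_concentration(curve, 0.35)
```

A reading outside the range spanned by the standards is rejected rather than extrapolated, since the sketch makes no assumption about the shape of the curve beyond the measured points.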

Alternately, fluorescent compounds, such as fluorescein and rhodamine, may be
chemically coupled to antibodies without altering their binding capacity. When activated by
illumination with light of a particular wavelength, the fluorochrome-labeled antibody absorbs the
light energy, inducing a state of excitability in the molecule, followed by emission of the light at
a characteristic longer wavelength. The emission appears as a characteristic color visually
detectable with a light microscope. Immunofluorescence and EIA techniques are both very well
established in the art and are particularly preferred for the present method. However, other
reporter molecules, such as radioisotopes, chemiluminescent or bioluminescent molecules may
also be employed. It will be readily apparent to the skilled artisan how to vary the procedure to
suit the required use.

This invention also relates to the use of polynucleotides of the present invention as
diagnostic reagents. In particular, the invention relates to a method for the diagnosis of a
condition associated with abnormal HDAC9 expression or activity, for example, abnormal cell
proliferation, cancer, atherosclerosis, inflammatory bowel disease, host inflammatory or immune
response, or psoriasis in a human which comprises: detecting elevated transcription of messenger
RNA transcribed from the natural endogenous human gene encoding the polypeptide consisting
of an amino acid sequence set forth in SEQ ID NOs:1, 5 or 6 in an appropriate tissue or cell
from a human, wherein said elevated transcription is diagnostic of said human's suffering from
the condition associated with abnormal HDAC9 expression or activity. In particular, said natural
endogenous human gene comprises the nucleotide sequence set forth in SEQ ID NO:4, 7 or 8.
In a preferred embodiment such a method comprises contacting a sample of said appropriate
tissue or cell or contacting an isolated RNA or DNA molecule derived from that tissue or cell
with an isolated nucleotide sequence of at least about 20 nucleotides in length that hybridizes
under high stringency conditions with the isolated nucleotide sequence encoding a polypeptide
consisting of an amino acid sequence set forth in SEQ ID NOs:1, 5 or 6.

Detection of a mutated form of the gene characterized by the polynucleotide of SEQ ID
NO:4, 7 or 8 which is associated with a dysfunction will provide a diagnostic tool that can add
to, or define, a diagnosis of a disease, or susceptibility to a disease, which results from
under-expression, over-expression or altered spatial or temporal expression of the gene. Individuals
carrying mutations in the gene may be detected at the DNA level by a variety of techniques.

Nucleic acids, in particular mRNA, for diagnosis may be obtained from a subject's cells,
such as from blood, urine, saliva, tissue biopsy or autopsy material. The genomic DNA may be
used directly for detection or may be amplified enzymatically by using PCR or other
amplification techniques prior to analysis. RNA or cDNA may also be used in similar fashion.
Deletions and insertions can be detected by a change in size of the amplified product in
comparison to the normal genotype. Point mutations can be identified by hybridizing amplified
DNA to labeled nucleotide sequences encoding the HDAC9 polypeptide of the present invention.
Perfectly matched sequences can be distinguished from mismatched duplexes by RNase
digestion or by differences in melting temperatures. DNA sequence differences may also be
detected by alterations in electrophoretic mobility of DNA fragments in gels, with or without
denaturing agents, or by direct DNA sequencing (e.g., Myers et al., Science (1985) 230:1242).
Sequence changes at specific locations may also be revealed by nuclease protection assays, such
as RNase and S1 protection or the chemical cleavage method (see Cotton et al., Proc Natl Acad
Sci USA (1985) 85: 4397-4401). In another embodiment, an array of oligonucleotide probes
comprising nucleotide sequence encoding the HDAC9 polypeptide of the present invention or
fragments of such a nucleotide sequence can be constructed to conduct efficient screening of,
e.g., genetic mutations. Array technology methods are well known and have general applicability
and can be used to address a variety of questions in molecular genetics including gene
expression, genetic linkage, and genetic variability (see for example: M. Chee et al., Science, Vol
274, pp 610-613 (1996)).
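
The comparison of an amplified product with the normal genotype described above can be illustrated computationally once both sequences are known. The following Python sketch is a hypothetical illustration only; the function name and the example sequences are invented for this purpose. It reports an insertion or deletion when the product differs in size from the reference, and otherwise lists the positions of any point differences. It assumes the two sequences span the same primer-defined region and does not attempt to resolve compensating insertions and deletions of equal size.

```python
def classify_variant(reference, sample):
    """Compare an amplified sample sequence with the normal (reference) sequence.

    Returns a dict giving the kind of change, the size change in bases,
    and the 1-based positions of any single-base differences.
    """
    reference, sample = reference.upper(), sample.upper()
    size_change = len(sample) - len(reference)
    if size_change > 0:
        return {"type": "insertion", "size_change": size_change, "positions": []}
    if size_change < 0:
        return {"type": "deletion", "size_change": size_change, "positions": []}
    # Same size: look for single-base differences position by position.
    positions = [i + 1 for i, (r, s) in enumerate(zip(reference, sample)) if r != s]
    kind = "point mutation" if positions else "no change"
    return {"type": kind, "size_change": 0, "positions": positions}


# Hypothetical 8-base amplicons used only to exercise the function.
result = classify_variant("ACGTACGT", "ACGTTCGT")
```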

The diagnostic assays offer a process for diagnosing or determining a susceptibility to
disease through detection of mutation in the HDAC9 gene by the methods described. In addition,
such diseases may be diagnosed by methods comprising determining from a sample derived from
a subject an abnormally decreased or increased level of polypeptide or mRNA. Decreased or
increased expression can be measured at the RNA level using any of the methods well known in
the art for the quantitation of polynucleotides, such as, for example, nucleic acid amplification,
for instance PCR, RT-PCR, RNase protection, Northern blotting and other hybridization
methods. Assay techniques that can be used to determine levels of a protein, such as a
polypeptide of the present invention, in a sample derived from a host are well-known to those of
skill in the art. Such assay methods include radioimmunoassays, competitive-binding assays,
Western Blot analysis and ELISA assays.

Thus in another aspect, the present invention relates to a diagnostic kit which comprises:
(a) a polynucleotide of the present invention, preferably the nucleotide sequence of SEQ
ID NO:2, 3, 4, 7 or 8 or a fragment thereof;

(b) a nucleotide sequence complementary to that of (a);

(c) a polypeptide of the present invention, preferably the polypeptide of SEQ ID NOs:1, 5
or 6 or a fragment thereof; or

(d) an antibody to a polypeptide of the present invention, preferably to the polypeptide of
SEQ ID NOs:1, 5 or 6.

It will be appreciated that in any such kit, (a), (b), (c) or (d) may comprise a substantial
component. Such a kit will be of use in diagnosing a disease or susceptibility to a disease,
particularly to a disease or condition associated with abnormal HDAC9 expression or activity,
for example, abnormal cell proliferation, cancer, atherosclerosis, inflammatory bowel disease,
host inflammatory or immune response, or psoriasis.

The nucleotide sequences of the present invention are also valuable for chromosome
localization. The sequence is specifically targeted to, and can hybridize with, a particular
location on an individual human chromosome. The mapping of relevant sequences to
chromosomes according to the present invention is an important first step in correlating those
sequences with gene associated disease. Once a sequence has been mapped to a precise
chromosomal location, the physical position of the sequence on the chromosome can be
correlated with genetic map data. Such data are found in, for example, V. McKusick, Mendelian
Inheritance in Man (available on-line through Johns Hopkins University Welch Medical
Library). The relationship between genes and diseases that have been mapped to the same
chromosomal region is then identified through linkage analysis (coinheritance of physically
adjacent genes).

The differences in the cDNA or genomic sequence between affected and unaffected
individuals can also be determined. If a mutation is observed in some or all of the affected
individuals but not in any normal individuals, then the mutation is likely to be the causative
agent of the disease.

An additional embodiment of the invention relates to the administration of a
pharmaceutical composition, in conjunction with a pharmaceutically acceptable carrier, excipient
or diluent, for any of the therapeutic effects discussed above. Such pharmaceutical compositions
may consist of HDAC9, antibodies to that polypeptide, mimetics, agonists, antagonists, or
inhibitors of HDAC9 function. The compositions may be administered alone or in combination
with at least one other agent, such as a stabilizing compound, which may be administered in any
sterile, biocompatible pharmaceutical carrier, including, but not limited to, saline, buffered
saline, dextrose, and water. The compositions may be administered to a patient alone, or in
combination with other agents, drugs or hormones.

In addition, any of the therapeutic proteins, antagonists, antibodies, agonists, antisense
sequences or vectors described above may be administered in combination with other appropriate
therapeutic agents. Selection of the appropriate agents for use in combination therapy may be
made by one of ordinary skill in the art, according to conventional pharmaceutical principles.
The combination of therapeutic agents may act synergistically to effect the treatment or
prevention of the various disorders described above. Using this approach, one may be able to
achieve therapeutic efficacy with lower dosages of each agent, thus reducing the potential for
adverse side effects. Antagonists and agonists of HDAC9 may be made using methods which
are generally known in the art.

The pharmaceutical compositions encompassed by the invention may be administered by
any number of routes including, but not limited to, oral, intravenous, intramuscular,
intra-articular, intra-arterial, intramedullary, intrathecal, intraventricular, transdermal, subcutaneous,
intraperitoneal, intranasal, enteral, topical, sublingual, or rectal means.

In addition to the active ingredients, these pharmaceutical compositions may contain
suitable pharmaceutically-acceptable carriers comprising excipients and auxiliaries which
facilitate processing of the active compounds into preparations which can be used
pharmaceutically. Further details on techniques for formulation and administration may be found
in the latest edition of Remington's Pharmaceutical Sciences (Mack Publishing Co., Easton,
Pa.).

Pharmaceutical compositions for oral administration can be formulated using
pharmaceutically acceptable carriers well known in the art in dosages suitable for oral
administration. Such carriers enable the pharmaceutical compositions to be formulated as tablets,
pills, dragees, capsules, liquids, gels, syrups, slurries, suspensions, and the like, for ingestion by
the patient.

Pharmaceutical preparations for oral use can be obtained through combination of active
compounds with solid excipient, optionally grinding a resulting mixture, and processing the
mixture of granules, after adding suitable auxiliaries, if desired, to obtain tablets or dragee cores.
Suitable excipients are carbohydrate or protein fillers, such as sugars, including lactose, sucrose,
mannitol, or sorbitol; starch from corn, wheat, rice, potato, or other plants; cellulose, such as
methyl cellulose, hydroxypropylmethyl-cellulose, or sodium carboxymethylcellulose; gums
including arabic and tragacanth; and proteins such as gelatin and collagen. If desired,
disintegrating or solubilizing agents may be added, such as the cross-linked polyvinyl
pyrrolidone, agar, alginic acid, or a salt thereof, such as sodium alginate.

Dragee cores may be used in conjunction with suitable coatings, such as concentrated
sugar solutions, which may also contain gum arabic, talc, polyvinylpyrrolidone, carbopol gel,
polyethylene glycol, and/or titanium dioxide, lacquer solutions, and suitable organic solvents or
solvent mixtures. Dyestuffs or pigments may be added to the tablets or dragee coatings for
product identification or to characterize the quantity of active compound, i.e., dosage.

Pharmaceutical preparations which can be used orally include push-fit capsules made of
gelatin, as well as soft, sealed capsules made of gelatin and a coating, such as glycerol or
sorbitol. Push-fit capsules can contain active ingredients mixed with a filler or binders, such as
lactose or starches, lubricants, such as talc or magnesium stearate, and, optionally, stabilizers. In
soft capsules, the active compounds may be dissolved or suspended in suitable liquids, such as
fatty oils, liquid, or liquid polyethylene glycol with or without stabilizers.

Pharmaceutical formulations suitable for parenteral administration may be formulated in
aqueous solutions, preferably in physiologically compatible buffers such as Hanks' solution,
Ringer's solution, or physiologically buffered saline. Aqueous injection suspensions may contain
substances which increase the viscosity of the suspension, such as sodium carboxymethyl
cellulose, sorbitol, or dextran. Additionally, suspensions of the active compounds may be
prepared as appropriate oily injection suspensions. Suitable lipophilic solvents or vehicles
include fatty oils such as sesame oil, or synthetic fatty acid esters, such as ethyl oleate or
triglycerides, or liposomes. Non-lipid polycationic amino polymers may also be used for
delivery. Optionally, the suspension may also contain suitable stabilizers or agents which
increase the solubility of the compounds to allow for the preparation of highly concentrated
solutions.

For topical or nasal administration, penetrants appropriate to the particular barrier to be
permeated are used in the formulation. Such penetrants are generally known in the art.

The pharmaceutical compositions of the present invention may be manufactured in a
manner that is known in the art, e.g., by means of conventional mixing, dissolving, granulating,
dragee-making, levigating, emulsifying, encapsulating, entrapping, or lyophilizing processes.

The pharmaceutical composition may be provided as a salt and can be formed with many
acids, including but not limited to, hydrochloric, sulfuric, acetic, lactic, tartaric, malic, succinic,
etc. Salts tend to be more soluble in aqueous or other protonic solvents than are the
corresponding free base forms. In other cases, the preferred preparation may be a lyophilized
powder which may contain any or all of the following: 1-50 mM histidine, 0.1%-2% sucrose,
and 2-7% mannitol, at a pH range of 4.5 to 5.5, that is combined with buffer prior to use.

After pharmaceutical compositions have been prepared, they can be placed in an
appropriate container and labeled for treatment of an indicated condition. For administration of
the HDAC9, such labeling would include amount, frequency, and method of administration.

Pharmaceutical compositions suitable for use in the invention include compositions
wherein the active ingredients are contained in an effective amount to achieve the intended
purpose. The determination of an effective dose is well within the capability of those skilled in
the art.

For any compound, the therapeutically effective dose can be estimated initially either in
cell culture assays, e.g., of neoplastic cells, or in animal models, usually mice, rabbits, dogs, or
pigs. The animal model may also be used to determine the appropriate concentration range and
route of administration. Such information can then be used to determine useful doses and routes
for administration in humans.

A therapeutically effective dose refers to that amount of active ingredient, for example
HDAC9 or fragments thereof, antibodies of HDAC9, agonists, antagonists or inhibitors of
HDAC9, which ameliorates the symptoms or condition. Therapeutic efficacy and toxicity may be
determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g.,
ED50 (the dose therapeutically effective in 50% of the population) and LD50 (the dose lethal to
50% of the population). The dose ratio between toxic and therapeutic effects is the therapeutic
index, and it can be expressed as the ratio, LD50/ED50. Pharmaceutical compositions which
exhibit large therapeutic indices are preferred. The data obtained from cell culture assays and
animal studies are used in formulating a range of dosage for human use. The dosage contained in
such compositions is preferably within a range of circulating concentrations that include the
ED50 with little or no toxicity. The dosage varies within this range depending upon the dosage
form employed, sensitivity of the patient, and the route of administration.
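
As a worked illustration of the ratio LD50/ED50 described above, the following minimal Python sketch computes a therapeutic index from two dose values. The function name and the numerical values are hypothetical and are chosen only to show the arithmetic; they are not measurements relating to HDAC9 or to any compound of the invention.

```python
def therapeutic_index(ld50, ed50):
    """Return the therapeutic index, defined as the ratio LD50/ED50.

    Both doses must be positive and expressed in the same units.
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical doses in mg/kg: a larger ratio indicates a wider margin
# between the therapeutically effective dose and the lethal dose.
index = therapeutic_index(ld50=500.0, ed50=25.0)  # 500 / 25 = 20.0
```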

The exact dosage will be determined by the practitioner, in light of factors related to the
subject that requires treatment. Dosage and administration are adjusted to provide sufficient
levels of the active moiety or to maintain the desired effect. Factors which may be taken into
account include the severity of the disease state, general health of the subject, age, weight, and
gender of the subject, diet, time and frequency of administration, drug combination(s), reaction
sensitivities, and tolerance/response to therapy. Long-acting pharmaceutical compositions may
be administered every 3 to 4 days, every week, or once every two weeks depending on half-life
and clearance rate of the particular formulation.

Normal dosage amounts may vary from 0.1 to 100,000 micrograms, up to a total dose of
about 1 g, depending upon the route of administration. Guidance as to particular dosages and
methods of delivery is provided in the literature and generally available to practitioners in the art.
Those skilled in the art will employ different formulations for nucleotides than for proteins or
their inhibitors. Similarly, delivery of polynucleotides or polypeptides will be specific to
particular cells, conditions, locations, etc. Pharmaceutical formulations suitable for oral
administration of proteins are described, e.g., in U.S. Patents 5,008,114; 5,505,962; 5,641,515;
5,681,811; 5,700,486; 5,766,633; 5,792,451; 5,853,748; 5,972,387; 5,976,569; and 6,051,561.

The following Examples illustrate the present invention, without in any way limiting the 
scope thereof. 

Examples 

Example 1: Identification of a novel HDAC related human DNA sequence using bioinformatics
HDAC9 was identified using computer software for the identification of new members of gene
families based on a strategy to find maximal evolutionary links among known HDAC family
members by first searching the non-redundant amino acid database, followed by searching less
diverse databases such as the Celera Human Genome Database (CHGD), public High
Throughput Genomic (HTG) database and the Incyte LIFESEQ™ database. Smith-Waterman
(Pearson W. R. Comparison of methods for searching protein sequence databases. Protein Sci
(1995) 4, 1145-60) and Hidden Markov Model (probability models derived from diversity of
amino acids at every position; Eddy S. R. Hidden Markov models. Curr Opin Struct Biol (1996)
6, 361-5) searches were performed. A 1156 bp open reading frame (ORF) was identified and used to
search a database of sequenced clones from pan-tissue and dorsal root ganglion cDNA libraries.
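
By way of illustration of the Smith-Waterman local alignment referred to in Example 1, the following Python sketch aligns two short sequences. It is a minimal sketch and not the software actually used: the match, mismatch and gap scores are simple assumed values (protein searches ordinarily use a substitution matrix and affine gap penalties), and the function name and test strings are hypothetical.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best_score, aligned_a, aligned_b) for the best local alignment.

    Uses a linear gap penalty and fixed match/mismatch scores (assumed values).
    """
    rows, cols = len(a) + 1, len(b) + 1
    H = [[0] * cols for _ in range(rows)]  # scoring matrix, first row/column zero
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # A local alignment may restart anywhere, hence the floor of zero.
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)
    # Trace back from the highest-scoring cell until a zero cell is reached.
    i, j = best_pos
    top, bottom = [], []
    while i > 0 and j > 0 and H[i][j] > 0:
        score = match if a[i - 1] == b[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + score:
            top.append(a[i - 1])
            bottom.append(b[j - 1])
            i, j = i - 1, j - 1
        elif H[i][j] == H[i - 1][j] + gap:
            top.append(a[i - 1])
            bottom.append("-")
            i -= 1
        else:
            top.append("-")
            bottom.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(top)), "".join(reversed(bottom))


# Hypothetical strings sharing the local segment "HDAC".
score, aligned_a, aligned_b = smith_waterman("AAAHDACKKK", "GGHDACTT")
```

Because every cell is floored at zero, the alignment reports only the best-matching local segment and ignores unrelated flanking sequence, which is the property that makes this approach suitable for finding a conserved domain within otherwise divergent sequences.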

Example 2: Construction of pan-tissue and dorsal root ganglion cDNA libraries 

Pan-tissue and dorsal root ganglion cDNA libraries are prepared from polyA+ RNA. Total RNA
was extracted from a pooled sample of 31 human tissues or dorsal root ganglia and isolated using
TRIzol reagent according to manufacturer's instructions (Life Technologies, Rockville, MD).
mRNA is isolated using Polytract mRNA Isolation System III according to manufacturer's
instructions (Promega, Madison, WI). Total RNA is hybridized to a biotinylated-oligo(dT) probe.
The oligo(dT)-mRNA hybrids are captured on streptavidin magnesphere particles and eluted in
RNase-free H2O. 3 ul of biotinylated-oligo(dT) probe (50 pmol/ul) and 13 ul of 20X SSC are added
to 60-150 ug of RNA that is heated to 65°C in RNase-free water. This mixture is incubated at
room temperature until it is completely cooled. Streptavidin-paramagnetic particles (beads) are
resuspended and washed 3 times in 0.5X SSC and then resuspended in 0.5X SSC. The
RNA-oligo(dT) hybrids from the previous step are added to these beads. To release the poly-A RNA
from the beads, the beads are resuspended in RNase-free water and magnetically captured and
then the eluate from the beads is ethanol precipitated. First and second strand cDNA synthesis is
performed using a modified procedure from Life Technologies (D'Alessio, J. M., Gruber, C.E.,
Cain, C. R., and Noon, M. C. (1990) Focus 12, 47). First strand synthesis is performed by
incubating 1-5 ug of RNA that is heated to 60°C in 1X 1st strand buffer (Life Technologies)/6
mM DTT/600 nM dNTPs/2 units anti-RNase. This mixture is incubated at 40°C for 2 min, then
Superscript II reverse transcriptase (RT) and 1 ul of Display Thermo RT terminator mix are added
and the mixture is incubated at 40°C for 1 h, followed by incubation at 60°C for 10 min. Second
strand synthesis is performed in 1X second strand buffer (Life Technologies) in DEPC-H2O/66
nM/1 ul E. coli DNA ligase/4 ul E. coli DNA polymerase I/1 ul E. coli RNase H. This mixture is
incubated at 10°C for 10 min and then at 16°C for 2 h. To this mixture, 2 ul of T4 DNA
polymerase is added and incubation is continued at 16°C for 5 min. The reaction is stopped with
10 ul of 0.5M EDTA, extracted with phenol/chloroform/isoamyl alcohol and then ethanol
precipitated. Sal I and Not I adaptors are added to the 5' ends of the cDNAs by ligation for
directional cloning using conventional methodology. The cDNAs are then passed through a size
fractionation column to retrieve cDNAs that are >500 bp in length according to manufacturer's
instructions (Life Technologies, Rockville, MD). cDNAs are ligated to Sal I/Not I digested
Gateway compatible pCMV-Sport6 vector (Life Technologies, Rockville, MD) using
conventional methods. Competent DH10B cells (Life Technologies, Rockville, MD) are
transformed with the resulting library using conventional methods. Semi-solid amplification of
the libraries is performed according to the manufacturer's instructions (Life Technologies,
Rockville, MD).

Example 3: Preparation of full length cDNA encoding the novel HDAC9 consisting of SEQ ID
NO:1, 5 or 6
A 1156 base pair ORF was used to search a database of sequenced clones from
pan-tissue and dorsal root ganglion cDNA libraries using BLAST. Four clones were found to
contain the ORF (M6, K10, P3, F23), two from each library. Of these clones, M6 from the
pan-tissue library was determined to be the most complete, but missing approximately 44 bp from the
N-terminus. A protein slightly smaller than that predicted for the complete cDNA was observed
by in vitro translation. The result that proteins were observed by in vitro translation of the
incomplete cDNA suggests the possibility of alternate translation initiation sites within HDAC9.
Specifically, sequencing of HDAC9 in pCMVSport6 was performed using an automated ABI
Sequencer (ACGT, Northbrook, IL). PCR was performed using the conditions listed in the ABI
Prism BigDye™ Terminator Cycle Sequencing Ready Reaction Kit manual, as follows:
denaturation at 96°C for 30 seconds, annealing at 50°C for 15 seconds, extension at 60°C for 4
minutes, for a total of 25 cycles. Each round of sequencing provided between 200 and 600 bp of
sequence. PCR primers for 1st round sequencing were 5'-ATTTAGGTGACACTATAG-3' (Sp6,
sense) and 5'-TAATACGACTCACTATAGGG-3' (T7, antisense). Results of sequencing using the
Sp6 primer are as follows. Bolded sequence is pCMVSport6 vector sequence.

CrggtACCGGTCCGGAATTCCCGGGATATCGTCGACCCACGCGTCCG/GGCTGCT
CCCGGCCGAAGCCCCGAGTGCGAGATCGAGCGTCCTGAGCGCCTGACCGCAGCCCT
GGATCGCCTGCGGCAGCGCGGCCTGGAACAGAGGTGTCTGCGGTTGTCAGCCCGCG 
AGGCCTCGGAAGAGGAGCTGGGCCTGGTGCACAGCCCAGAGTATGTATCCCTGGTC 
AGGGAGACCCAGGTCCTAGGCAAGGAGGAGCTGCAGGCGCTGTCCGGACAGTTCGA 
CGCCATCTACTTCCACCCGAGTACCTTTCACTGCGCGCGGCTGGCCGCAGGGGCTGG
ACTGCAGCTGGTGGACGCTGTGCTCACTGGAGCTGTGCA:AAATGGGCTTGCCCTGG 
TGAGGCCTCCCGGGCACCATGGCCAGAGGGCGGCTGCAACGGGTTCTGCGTGTTCA 
ACAACGTGGCCATAGCAGCTGCACATGCCAAGCAGAAACACGGGCTACACAGGATC 
CTCGTCGTGGACTGGGGGATGTGCACCATGGCAGGGGGATCCAGTATCTCTTTGAAG
GATGACCCCAGCGTCCTTTACTTCTCCTGGCACCGCTATGAGCATTGGGCGCCTTCT
GGCCTTTCTGCGAGAGTCAGATGAgACGCATGGGGGGCGGGGGACAGGGCCTCGGC 
TTCACTGTCaACCTGCCCTGACCAAGTTgGGGGAATGGGGAAACGCTGACTTACGTG 
GCTGGCCITCTTGCACCnrTGCTGGTTCCAcTGGCCTTTTGGAGTTTGACCTGAgCTGG 
GTGCTTGGTcTCgGCAGGGATTTGACTcagcCaTtCgGGACCCTGAgGGGGCAAA. Results
of sequencing using the T7 primer were:

TCAAGCCACCAGGTGAGGATGGCACTGCAACATCTTCCACTGAGGCTCCAGCTGCCC 
TCTCAGGTACATCAGGGCCTGGACGTCCTCTGGGGAGGCCACAGAGGAAGGGCCTA 
GGCTAGGAGGTGCCTCTCCATTCAGCACCCGGGCCAGGATCCCTGCTAGCTGGGGTG 
TGGAGTTCTCCTCCAGGAGGGCCAGGACTCGGCCCCCTGCCAGCCCCCGAAGCATTG
CAGCCAGGAGTGCAGCGTGGGGGCCCTGCAGGCCATGGCCAGGCCCCAGCGCCACC
AGCACCAGGTCAGGCTGGAAGCCATAGGCCAGGGGCAGCaCCAAGCCCAAGATGCA 
GCTCAGGAAACCACCGGTCATCACTGGCAGTGGCGTGGAGACATGGAACATGGA[T 
AGGGCAGcCGCCTCCTTGCCCTGATGTTCAGCCACAGACTcCTCCCGTCATGGGCGA 
AGTCTGGAGGCCGGTCCAgCTGTtaGGCCACGCACAGAgtCTCTGGGCTCCgtGGGACA 

gGCCT:TTTtGAAAAGAtATTtAGGGTGGGTTGTGAacaggGCTGGAATGGCTGGTATAcC
AcTGtTT AcCTGCCATT.

2nd and 3rd round sequencing primers are designed to prime sequence
obtained from the previous round of sequencing. 2nd round primers are
5'-GTCATCACTGGCAGTGGCGTG-3' (HUF7392, antisense) and 5'-TGGACTGCAGCTGGTGG-3' (DF-2,
sense). Results of sequencing using the DF-2 primer were:

CTGGcAAATGGGCTTGCCCTGG

TGAGGCCrcCCGGGCACCATGGCCAGAGGGCGGCTGCCAACGGGTTCTGCGTGTTC
AACAACGTGGCCATAGCAGCTGCACATGCCAAGCAGAAACACGGGCTACACAGGAT
CCTCGTCGTGGACTGGGATGTGCACCATGGCCAGGGGATCCAGTATCTCTTTGAGGA 
TGACCCCAGCGTCCTTTACTTCTCCTGGCACCGCTATGAGCATGGGCGCTTCTGGCCT 
TTCCTGCGAGAGTCAGATGCAGACGCAGTGGGGCGGGGACAGGGCCTCGGCTTCAC 
TGTCAACCTGCCCTGGAACCAGGTTGGGATGGGAAACGCTGACTACGTGGCTGCCTt
cCTGCACCTGCTGCTCCCACTGGCCTTTGAGTTTGACCCTGAGCTGGTGCTGGTCTCG 
GCAGGATTTGACTCAGCCATCGGGGACCCTGAGGGGCAAATGCAGGCCACGCCAGA 
GTGCTTCGCCCACCTCACACAGCTGCTGCAGGTGCTGGCCGGCGGCCGGGTCTGTGC 
CGTGCTGGAGGGCGGCTACCACCTGGAGTCACTGGCGGAGTCAGTGTGCATGACAG 

TACAGACGCTGCTGGGTGACCCGGcCCCACCCCTGTCAGGGCCAATGGCGCCATGTC
AGAGTGCCCTAgAgTCATTCAgAGTGCCCGTGCTGCCAGGcCCCGCACTGGAAAgAgG 
CTTCAgCAGCAAgATGTGACCGcTGTGCCGATGAACCCCA.

Sequencing results for the HUF7392 primer were:

TGtaTAGGGcAGCCGCCTCCTTGCC
CCTGATGTTCAGCCACAGACTCCTCCCGTCATGGGCGAGG

TCTGGAGGCCGGTCCAGCTGTCCCAGGGCCACGCACAGCAGCCTCTGGGCTCCGTG
GGACAGGCCTCTCCGAACAGCCACATCCAGGGTGGCTGCTGCAGCAGAGGCTGGAG 
TGGCTGCrATACCACTGTTCACCTGCCCATCCAGCATCCCATCTAAGAGGTACAGGA 
GCTTCCCAAGTGCAGTGAGGGCCTCCTCCCGGGCCAGGGACTCGTGTGGCCTGGCCC 
AGGCTTCTGTCTCCTCCCTCAGGGCTGACGCTTCTGTTGGATGACGTCAGGGGGCAG 

AACCAATGTGATATCCGGCGTTGTCAAGGGCAACAGCGGTGCGGACAGAGGGTGCG
GGGCAGAGGCACgGCTGGTCCAgGAGGGAGCTCGGTGCAgATGCAGcTGCCTTACAC 
ACTGgACCCCCAGGC AGCAGAGGTGGAGGCCTCCCCTCTGGGGAGTG.

3rd round sequencing primers were 5'-AACAGCGGTGCGGACAGA-3' (HUF2A, antisense) and
5'-CTGGAGTCACTGGCGGAG-3' (DF3A, sense). Results of sequencing using the DF3A primer were:

AgcaCAGA cGCTgCTGGGTGACCCGGCCCACCCCTG

TCAGGGCCAATGGCGCCATGTCAGAGTGCCCTAGAGTCCATCCAGAGTGcCCGTGCT 
GCCCAGGCCCCGCACTGGAAGAGCCTCCAGCAGCAAGATGTGACCGCTGTGCCGAT 
GAGCCCCAGCAGCCACTCCCCAGAGGGGAGGCCTCCACCTCTGCTGCCTGGGGGTC 
CAGTGTGTAAGGCAGCTGCATCTGCACCGAGCTCCCTCCTGGACCAGCCGTGCCTCT 

GCCCCGCACCCTCTGTCCGCACCGCTGTTGCCCTGACAACGCCGGATATCACATTGG
TTCTGCCCCCTGACGTCATCCAACAGGAAGCGTCAGCCCTGAGGGAGGAGACAGAA
GCCTGGGCCAGGCCACACGAGTCCCTGGCCCGGGAGGAGGCCCTcACTGcAClTGGG 
AAGCTCCTGTACCTcTTAgATGGGATGCT 

Results of sequencing using the HUF2A primer were:

TgcaCGGATGGTCCAGGAGGGAGCTCG
GTGCAAATGCAGCTGCCTTACACACTG
CCCTcTGGGGAGTGGCTGCTGGGGC^^

AGGCTCTTCCAGTGCGGGGCCTGGGCAGCACGGGCACTCTGGATGGACT 
ACTCTGACATGGCGCCATTGGCCCTGACAGGGGTGGGGCCGGGTCACCCAGCAG 
TCTGTACTGTCATGCACACTGACTCCGCCAGTGACTCCAGGTGGTAGCCGC 
GCACGGCACAgACCCGGCCGCCGGCCAGGACCTGCAGCAGCTGTGTGAGGTGGGCg
AAGCACTCTGGCGTGGCCTGCATTTGCCCCTCAG 

TCCTGCCGAGACCAGCACCAGCTCAGGGTCAAACTCAAAGGCCAGTGGGAGCAGCA 
GGTGCAGGAAGGCAGCCACgTATCAGCGTTTCCCATCCCAACCTGgTTCCAGGGGCA 
GGTTGAACAGTGAAGCCGAGGGCCCCTTGTCCCCgCCCCACCITGCGTCT 

CTCTCGCAGGAAAGGCCAAgAAGCgCCCATgCTAl'l TT.

The overlapping sequence from the combined sense and antisense sequencing was reconstructed
to give the complete cDNA sequence of HDAC9. See Figure 2A.
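
The reconstruction step described above, in which antisense reads are reverse-complemented and joined to sense reads where they overlap, can be illustrated with a minimal Python sketch. The function names, the `min_overlap` threshold and the toy reads are assumptions made for illustration only; the reads are not HDAC9 data, and real assembly software also has to tolerate base-calling errors, which this exact-match sketch does not.

```python
# Minimal sketch: orient an antisense read and join it to a sense read
# on their longest exact suffix/prefix overlap. Toy data, not HDAC9 sequence.

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def merge_by_overlap(left, right, min_overlap=8):
    """Join two same-strand reads on the longest exact overlap.

    Returns the merged sequence, or None if the end of `left` and the
    start of `right` share fewer than `min_overlap` bases.
    """
    left, right = left.upper(), right.upper()
    for size in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return left + right[size:]
    return None


# Hypothetical reads: the antisense read is stored as it would be reported,
# i.e. as the reverse complement of the sense strand.
sense_read = "ATGGCCAGAGGGCGGCTGCAACGG"
antisense_read = reverse_complement("GGCTGCAACGGGTTCTGCGTGTTC")

contig = merge_by_overlap(sense_read, reverse_complement(antisense_read))
print(contig)
```

Requiring a minimum overlap length is one simple way to avoid joining reads on a short chance match.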

BLAST is used to search the Genbank database using cDNA clone M6 as the query to
identify a genomic sequence containing M6 cDNA sequence. The results of this search identified
a genomic sequence AL022328 that was found to contain exons that were identical in sequence
to the M6 cDNA. The sequence of cDNA clone M6 was confirmed by automated DNA
sequencing (ACGT, Inc. Northbrook, IL). See Figure 2A.

The remaining 44 bp of N-terminal sequence was added by PCR using the nested sense
strand primers 5'-GCGGTCGACGCCACCATGGGGACCGCGCTTGTGTACCATGAGGACATG-3' and
5'-GTGTACCATGAGGACATGACGGCCACCCGGCTGCTCTGGGACGACCCCGAGTGC-3' and the 3' primer
5'-GAACCAATGTGATATCCGGCGTTG-3'. The 5' primer added a Kozak sequence and a SalI site
for cloning, and the 3' primer sequence overlaps the EcoRV site in HDAC9. PCR was performed
using a step-cycle file for amplification using 1 cycle of 94°C for 30 seconds, 68°C for 30
seconds, and 72°C for 1 minute, followed by 20 cycles of 94°C for 30 seconds and 72°C for 1 minute.
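
As a quick check of the primer design described above, the following Python sketch looks for the SalI and EcoRV recognition sequences and a Kozak-type sequence in the primers, and confirms that the two nested sense primers overlap. The primer strings are transcribed from the text above; the dictionary of sites, the use of `GCCACCATGG` as the Kozak-type sequence, and the function name are assumptions made for illustration.

```python
# Sketch: locate cloning features in the primers quoted in the text.
# SalI = GTCGAC and EcoRV = GATATC are the standard recognition sequences;
# GCCACCATGG is used here as one common Kozak-type sequence (an assumption).

SITES = {"SalI": "GTCGAC", "EcoRV": "GATATC", "Kozak": "GCCACCATGG"}

outer_sense = "GCGGTCGACGCCACCATGGGGACCGCGCTTGTGTACCATGAGGACATG"
inner_sense = "GTGTACCATGAGGACATGACGGCCACCCGGCTGCTCTGGGACGACCCCGAGTGC"
antisense_3p = "GAACCAATGTGATATCCGGCGTTG"


def find_sites(primer):
    """Map each feature present in the primer to its 0-based start index."""
    return {name: primer.find(site) for name, site in SITES.items() if site in primer}


def shared_overlap(first, second):
    """Length of the longest suffix of `first` that is a prefix of `second`."""
    for size in range(min(len(first), len(second)), 0, -1):
        if first.endswith(second[:size]):
            return size
    return 0


print(find_sites(outer_sense))
print(find_sites(antisense_3p))
print(shared_overlap(outer_sense, inner_sense))
```

The overlap between the two sense primers is what allows them to be used in a nested fashion, with the outer primer extending the product of the inner one.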
Example 3: HDAC9 sequence variants

Three variants of the HDAC9 sequence, HDAC9v1, HDAC9v2, and HDAC9v3, were
found. HDAC9v1 is the original sequence found and described above. HDAC9v2 was found in
the human dorsal root ganglion cDNA library and in AL022328 genomic sequence. HDAC9v3 is
a predicted transcript that lacks a stop codon and was found in the Celera human genomic
database. HDAC9v1 contains 20 exons and HDAC9v2 has 20 exons. Comparison of the peptide
sequences of HDAC9 variants demonstrated that HDAC9v1 and HDAC9v2 were identical up to
exon 17, but diverge after this exon. HDAC9v2 has an extended intron between exon 17 and 18
and an extended exon 18 that contains HDAC9v1 exon 19, but lacks 20, as a result of a single
nucleotide insertion at nucleotide 446. This insertion frame shifts the sequence and shortens the
peptide by 11 amino acids (Fig. 11A). Compared to HDAC9v1 and HDAC9v2, HDAC9v3 has an
internal deletion of amino acids 219 through 240 and diverges in its C-terminus beginning at
amino acid 486. HDAC9 is the first HDAC enzyme for which sequence variants have been
reported. HDAC9v1 is the sequence variant that is characterized, unless otherwise noted.

Example 4: Identification of HDAC-associated sequence motifs.

The M6 clone was analyzed for the presence of motifs that would indicate an HDAC
catalytic domain and a binding site for Rb and Rb-like proteins. HDACs are characterized by the
presence of a catalytic domain with conserved amino acids. Most of the HDACs that have been
identified to date have one catalytic domain, with the exception of HDAC6, which has two
domains. N-terminal catalytic domains have been associated with class I HDACs, while
C-terminal catalytic domains are associated with class II HDACs. An N-terminal catalytic domain
was found in HDAC9 based upon PFAM prediction and alignment with the catalytic domains of
other HDACs. A set of conserved amino acids was previously shown to be critical for HDAC
activity and to provide the critical contacts for the HDAC inhibitor TSA, based upon single amino
acid mutations in HDAC1 and the three dimensional structure formed by a complex of an HDAC-like
protein (HDLP), Zn2+ and the HDAC inhibitor TSA (Hassig CA, Tong JK, Fleischer TC, Owa T,
Grable PG, Ayer DE, Schreiber SL. (1998) Proc Natl Acad Sci U S A. 95, 3519-3524; Finnin,
M. S., Donigian, J. R., Cohen, A., Richon, V. M., Rifkind, R. A., Marks, P. A., Breslow, R., and
Pavletich, N. P. (1999) Structures of a histone deacetylase homologue bound to TSA and SAHA
inhibitors. Nature 401, 188-193). HDLP, a bacterial protein with similarities in sequence and
enzymatic activity to human HDACs and the only class I HDAC-like structure elucidated, was used
as an HDAC template. Many of these conserved amino acids, with a few exceptions, were found
in HDAC9 (Table 4). Alignments of HDAC peptide sequences indicated that the hydrophobic
residue Leu 265 that forms part of the binding pocket in HDLP is replaced with Glu at amino
acid 272 in HDAC9. Similarly, Leu 265 is also replaced with Met in HDAC8 and with Lys in
HDAC6 domain 1. Furthermore, Asp 173 in HDLP is substituted with Gln at position 177 in
HDAC9, a difference that was also found in the HDAC6 catalytic domain 1. This Asp is
substituted with Asn in HDAC4, HDAC5, HDAC6 domain 2, and HDAC7. HDAC1-8 have been
shown to be catalytically active, hence the amino acid substitutions in these proteins have no
enzymatic consequences.

HDAC9 is similar in sequence to class I and class II HDACs. HDACs have been
classified by their sequence similarity with yeast HDACs Rpd3, Hda1, and Sir2 and by catalytic
domain location. Alignment of the peptide sequences of HDAC9, yeast HDACs Rpd3, Hda1, the
Hda1 subfamily member from fission yeast, cryptic loci regulator 3 (Clr3), and Sir2 determined
that HDAC9 had the highest sequence similarity with Clr3 (Table 1). However, the sequence
similarity is not high enough to categorize HDAC9.

Alignment of human HDACs 1-9 and Sir1-7 peptide sequences demonstrated that
HDAC9 was most similar to class II human HDAC6 (Table 2). Alignment of class I and class II
HDAC catalytic domains with HDAC9 catalytic domains demonstrated that HDAC6 catalytic
domain 1 has the most sequence similarity with HDAC9 (Table 3).

In order to compare the locations of catalytic domains in HDACs, PFAM predictions
were made of the catalytic domains in HDAC peptides (Fig. 11B). The location of the HDAC9
catalytic domain was at the N-terminus, similar to class I HDACs, and was estimated as
spanning the amino acid sequence from amino acid 4 to 323. In addition, the average length of
class I HDACs is 443 amino acids, while the average length of class II HDACs is 1069 amino
acids. The 673 amino acid HDAC9 peptide is between the average sizes of class I and class II
HDACs (Fig. 11B).



Table 1.

HDAC Class    HDAC     % Similarity to HDAC9
Class I       Rpd3     16
Class II      Hda1     18
              Clr3     23
Class III     Sir2

Table 2.

HDAC Class    HDAC     % Similarity to HDAC9
Class I       HDAC1    14
              HDAC2    15
              HDAC3    15
              HDAC8    22
Class II      HDAC4    21
              HDAC5    19
              HDAC6    37
              HDAC7    20
Class III     Sir1     5
              Sir2     7
              Sir3     11
              Sir4     4
              Sir5     8
              Sir6     10
              Sir7     15


Table 3.

HDAC Class    HDAC Isoform    % Similarity to HDAC9
Class I       HDAC1           20
              HDAC2           20
              HDAC3           20
              HDAC8           19
Class II      HDAC4           39
              HDAC5           38
              HDAC6-1         55
              HDAC6-2         53
              HDAC7           40



The protein product of the retinoblastoma (Rb) gene is a transcriptional regulator
that controls DNA synthesis, the cell cycle, differentiation and apoptosis and plays a
tissue-specific role in normal development. Rb complexes with the transcription factor E2F, an
interaction that is regulated by phosphorylation. Mutations in Rb lead to a hereditary form of
cancer of the retina, retinoblastoma. Mutations have also been found in a number of mesenchymal
and epithelial cancers. Mutations that affect regulators of Rb phosphorylation, including cyclin D1,
cdk4, and p16, have been found in many cancers. Therefore, Rb function is thought to play a
critical role in tumorigenesis (Sellers, W.R., Kaelin, W.G. Jr. (1997) J. Clin. Oncol. 15, 3301-
3312; DiCiommo, D., Gallie, B.L., Bremner, R. (2000) Semin. Cancer Biol. 10, 255-269). An
Rb-binding motif was previously defined as the amino acid sequence LXCXE, where "X" can be
any amino acid (Chen, T.-T. and Wang, J. Y. J. (2000) Mol. Cell. Biol. 20, 5571-5580). The
LXCXE domain in HDAC1 was found to be dispensable for the growth suppression function of Rb,
but necessary for HDAC binding to Rb. Two putative Rb-binding motifs were found in HDAC9
(Fig. 11A, green boxes). LLCVA is located between amino acids 510 and 515, and LSCIL is
located between amino acids 560 and 564. Both are present in HDAC9v1 and HDAC9v2.
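
A motif scan of this kind can be sketched with regular expressions in Python. The two putative HDAC9 motifs quoted above (LLCVA and LSCIL) have the L-X-C core but do not end in E, so the sketch uses both a strict LXCXE pattern and a relaxed pattern that accepts any fifth residue. The relaxed pattern, the function name and the toy peptide are assumptions for illustration; the toy peptide is not the HDAC9 sequence.

```python
import re

# Strict pattern: the canonical LXCXE Rb-binding motif.
STRICT = re.compile(r"L.C.E")
# Relaxed pattern (an assumption): L-X-C followed by any two residues,
# which also picks up LXCXE-like sequences such as LLCVA and LSCIL.
RELAXED = re.compile(r"L.C..")


def find_motifs(peptide, pattern):
    """Return (1-based start position, matched residues) for each match."""
    return [(m.start() + 1, m.group()) for m in pattern.finditer(peptide)]


# Hypothetical toy peptide containing three LXCXE-like stretches.
toy_peptide = "MKTLLCVAGGSLSCILPPLHCSEAA"

print(find_motifs(toy_peptide, STRICT))
print(find_motifs(toy_peptide, RELAXED))
```

Note that `finditer` reports non-overlapping matches only, which is sufficient for a sketch but would miss motifs that overlap one another.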
Example 5: mRNA distribution of HDAC9 in normal tissues

mRNA distribution of HDAC9 in normal tissues is investigated using Northern analysis.
Probes are prepared by 32P-labeling a 750 bp EcoRV/NotI HDAC9 fragment using the Redi-Prime
random nucleotide labelling kit according to the manufacturer's instructions (Amersham,
Piscataway, NJ). A Northern blot containing polyA+ RNA from 12 normal tissues (Origene
Technologies, Rockville, MD) and an array of matched tumor versus normal cDNAs (Clontech,
Palo Alto, CA) are probed with the [32P]-labeled 750 bp EcoRV/NotI HDAC9 fragment and
washed under high stringency conditions (68°C). Hybridized blots are washed two times for 15
min at 68°C in 2X SSC/0.1% SDS followed by two 30 min washes in 0.1X SSC/0.1% SDS at
68°C. The blot is exposed to film with an intensifying screen for 18 hr. Results indicate that an
approximately 3.0 Kb HDAC9 mRNA is detected in brain, colon, heart, kidney, liver, lung,
placenta, small intestine, spleen, stomach and testes. HDAC9 message was not detected in
muscle, but GAPDH was also not detected. See Figure 7.

Analogous computer techniques using BLAST (Altschul, S.F. 1993, 1990) are used to
search for identical or related molecules in nucleotide databases such as GenBank or the
LIFESEQ™ database. The basis of the search is the product score, which is defined as:

product score = (% sequence identity x % maximum BLAST score) / 100

The product score takes into account both the degree of similarity between two sequences and
the length of the sequence match. For example, with a product score of 40, the match will be
exact within a 1-2% error; and at 70, the match will be exact. Homologous molecules are
usually identified by selecting those which show product scores between 15 and 40, although
lower scores may identify related molecules.
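
The product score defined above can be written as a short Python function. This is a sketch only: the function and parameter names are assumptions, and "% maximum BLAST score" is interpreted here as the observed BLAST score expressed as a percentage of the maximum possible score for the match, which the text does not spell out.

```python
def product_score(percent_identity, blast_score, max_blast_score):
    """Product score = (% sequence identity x % maximum BLAST score) / 100.

    `blast_score` and `max_blast_score` are hypothetical inputs used to
    derive the "% maximum BLAST score" term of the formula.
    """
    if max_blast_score <= 0:
        raise ValueError("max_blast_score must be positive")
    percent_of_max = 100.0 * blast_score / max_blast_score
    return percent_identity * percent_of_max / 100.0


# A perfect, full-length match scores 100.
print(product_score(100, 50, 50))
# 80% identity over a match reaching half the maximum score gives 40.
print(product_score(80, 25, 50))
```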

The results of Northern analysis are reported as a list of libraries in which the transcript
encoding HDAC9 occurs. Abundance and percent abundance are also reported. Abundance
directly reflects the number of times a particular transcript is represented in a cDNA library, and
percent abundance is abundance divided by the total number of sequences examined in the
cDNA library.
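
The abundance calculation described above amounts to a simple ratio. The Python sketch below expresses it as a percentage; the function name and the example counts are hypothetical and are not taken from the LIFESEQ™ data.

```python
def percent_abundance(abundance, total_sequences):
    """Abundance divided by the total number of sequences examined in the
    cDNA library, expressed as a percentage."""
    if total_sequences <= 0:
        raise ValueError("total_sequences must be positive")
    return 100.0 * abundance / total_sequences


# Hypothetical library: a transcript seen 3 times among 6000 sequences.
print(percent_abundance(3, 6000))
```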

In this case, electronic Northern analysis of the LIFESEQ™ database (Incyte
Pharmaceuticals, Inc. Palo Alto, Calif.) indicates tissue distribution of the HDAC9 sequence as
seen in Table 5. These results are reported as a list of cDNA libraries in which the transcript
encoding HDAC9 occurs. The presence of HDAC9 in 20 libraries from different tissue-specific
and mixed tissue sources indicates that HDAC9, like other HDAC family members, may be
found as an expressed gene in a wide range of tissues. This result is supported by the Northern
hybridization of an HDAC9 probe to mRNAs from 12 normal tissues (see Figure 7).

Table 5. Tissue distribution determined electronically from LIFESEQ™ database.

Tissue Category
Cardiovascular System
Connective Tissue
Digestive System
Embryonic Structures
Endocrine System
Exocrine Glands
Genitalia, Female
Genitalia, Male
Germ Cells
Hemic and Immune System
Liver
Musculoskeletal System
Nervous System
Pancreas
Respiratory System
Sense Organs
Skin
Stomatognathic System
Unclassified/Mixed
Urinary Tract

Example 6: Real time PCR survey of HDAC9 distribution in human normal tissues and cell
lines.

Real Time PCR. Total RNA from cultured cell lines was isolated with the RNeasy 96 kit
according to the manufacturer's protocol (Qiagen, Valencia, CA). RNA from human tissues was
purchased (Clontech Inc., Palo Alto, CA) and the tissue sources are listed in Table 6 below.

Table 6. Tissue sources of RNA for real time PCR analysis

Tissue            Sex of donor    Age range of donor (yrs.)    Number of samples pooled
Brain 1           M               57                           1
Brain 2           F               16&36                        2
Cerebellum        M               64                           1
Spinal cord       M/F             17-72                        31
Fetal brain       M/F             20-23 wks                    8
Trachea           M/F             17-70                        84
Liver 1           M               27                           1
Liver 2           M/F             15&35                        2
Fetal liver       ?               15-24 wks                    ?
Stomach           M/F             23-61                        15
Pancreas          M/F             17-69                        18
Colon             M               35&50                        2
Intestine         M/F             25&30                        2
Kidney            M/F             24-55                        8
Bone marrow       M/F             18-68                        24
Spleen            M               22-60                        7
Thymus            M               6-45                         9
Thyroid           M/F             10-46                        4
Adrenal gland     M               32-50                        6
Salivary gland    M/F             13-78                        43
Mammary gland     F               23-47                        8
Skeletal muscle   M/F             23-56                        10
Testis            M               28-64                        25
Prostate 1        M               26-64                        23
Prostate 2        M               14-60                        10
Placenta          F               22-41                        15

Numbers following tissues represent separate samples of the same tissue type. Male (M), Female (F).



Human cell lines, H1299 human lung carcinoma, T24 bladder carcinoma, SJRH30 muscle
rhabdomyosarcoma, SJSA-1 osteosarcoma, human fibroblasts, and A549 human lung carcinoma,
were obtained from the American Type Culture Collection. Total RNA was isolated from
human cell lines using the RNeasy kit according to the manufacturer's instructions (Qiagen,
Valencia, CA). RNAs were quantified using RT-PCR on an ABI Prism Sequence Detection
System. The primers used for detection of HDAC9 were forward primer
5'-GGATCCAGTATCTCTTTGAGGATGAC-3', reverse primer 5'-AGAAGCGCCCATGCTCATA-3',
and Taqman probe 5'-AGCGTCCTTTACTTCTCCTGGCACCG-3'. The Taqman Reaction System
(Eurogentec, Belgium) was used with 10 ng total RNA in a 25 µl reaction in the proportions
indicated by the manufacturer but supplemented with 0.25 U/µl reverse transcriptase
(MultiScribe ABI, Perkin Elmer, Branchburg, NJ) and 0.08 U/µl RNaseOUT RNAse inhibitor
(Life Technologies, Gaithersburg, MD). The reaction was initiated with a 5 min incubation at
48°C for the reverse transcription of the mRNA, followed by a 10 min incubation at 95°C to
inactivate the reverse transcriptase and simultaneously activate the 'hot-start' thermostable DNA
polymerase. This was followed by 50 cycles of a two-step PCR reaction with alternating 15 sec
at 95°C and 60 sec at 60°C. Computations were performed using ABI sequence detection
software (version 1.63). The RT-PCR assays were standardized with cRNAs transcribed in vitro
with the T7 RNA polymerase reaction using the Maxiscript kit (AMBION Inc., Austin, TX)
according to the manufacturer's protocol. The RT-PCR assays were standardized with a dilution
series of total RNA isolated from A549 lung tumor cells. Parallel to the RT-PCR, the total amount
of RNA in each reaction was quantitated in a fluorometric assay using the RiboGreen kit
(Molecular Probes Inc., address) according to the manufacturer's instructions, using mammalian
ribosomal RNA provided with the kit as standard.
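
Standardizing a real-time assay against a dilution series, as described above, is commonly done by fitting a straight line of threshold cycle (Ct) against the logarithm of input quantity and reading unknowns off that line. The Python sketch below shows that general approach. The text does not state how the ABI software performed the computation, so the use of Ct values, the least-squares fit, the function names and the dilution data are all assumptions for illustration.

```python
import math


def fit_standard_curve(quantities, ct_values):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantity_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate the input quantity."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical 10-fold dilution series of reference RNA (ng) and Ct values.
standards_ng = [100.0, 10.0, 1.0, 0.1]
standards_ct = [20.0, 23.0, 26.0, 29.0]

slope, intercept = fit_standard_curve(standards_ng, standards_ct)
unknown_ng = quantity_from_ct(24.5, slope, intercept)
print(slope, intercept, unknown_ng)
```

A log-linear curve is used because each PCR cycle multiplies the template by a roughly constant factor, so Ct falls linearly as the logarithm of starting quantity rises.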

Real time PCR was also used to survey the distribution and levels of HDAC9 in tissues
and tumor cell lines, relative to the levels of 18S ribosomal RNA. RNA from the human A549
lung carcinoma cell line was arbitrarily chosen as an internal control for the levels of total RNA
in the samples. The levels of HDAC9 and 18S rRNA in A549 cells were set at 100% and the
levels of HDAC9 and 18S rRNA in other tissues and cell lines were measured as a percent of the
level of these genes in A549 RNA. The levels of 18S ribosomal RNA ranged between 82% and
126% of the A549 internal control in all of the RNA samples, suggesting that there were similar
amounts of RNA in the analyzed tissue samples. HDAC9 was detected at varying levels by real
time PCR in a wide range of tissues (Fig. 8), confirming the Northern blot analysis (Fig. 7). In
normal tissues, HDAC9 was detected at the highest levels in fetal brain (894%), cerebellum
(538%), and thymus (589%). In tumor cell lines, HDAC9 was detected at the highest levels in
SJRH30 cells (850%) (Fig. 8). These results suggest that HDAC9 is differentially expressed in
some tissues at the RNA level.
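
Expressing each measurement as a percent of the A549 reference, as described above, can be sketched in Python as follows. The function name and the signal values are hypothetical; they are not the measurements behind Fig. 8.

```python
def percent_of_reference(sample_signal, reference_signal):
    """Express a sample measurement as a percent of the reference (set to 100%)."""
    if reference_signal <= 0:
        raise ValueError("reference signal must be positive")
    return 100.0 * sample_signal / reference_signal


# Hypothetical quantities in arbitrary units for the reference and one sample.
a549 = {"HDAC9": 2.0, "18S": 50.0}
sample = {"HDAC9": 10.0, "18S": 55.0}

relative = {gene: percent_of_reference(sample[gene], a549[gene]) for gene in a549}
print(relative)
```

In this toy case the 18S value close to 100% would indicate a similar amount of input RNA, while the HDAC9 value indicates higher expression than in the reference.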

Example 7: HDAC Enzyme Assay

Preparation of HDAC9-flag. A flag epitope tag sequence was added to the 3' end of
HDAC9v1 by PCR. The PCR primers were 5'-ACGCCGGATATCACATTGGTTCTGC-3' and
5'-CGGAATTCTTATTATTTATCATCATCAT(mTATAATCCCCGTCGACAGCCACCAGGTGAGGATGGCA-3'.
The flag-tagged HDAC9v1 was reconstructed using the EcoRV site in the 1st primer and
subcloned into the XbaI and EcoRI sites of human expression vector pCDNA3.1(-)
(Invitrogen, Carlsbad, CA).

HDAC activity assay. HDAC activity assays are performed as previously described
(Emiliani, S., Fischle, W., Van Lint, C., Al-Abed, Y., and Verdin, E. (1998) Proc. Natl. Acad.
Sci. U.S.A. 95, 2795-2800). 5x10^6 293 cells grown to 50% confluency in 100 mm dishes are
transfected with 30 ug of C-terminally flag-tagged HDAC1, HDAC3, HDAC4, HDAC6,
HDAC7, or HDAC9 using the Geneporter transfection kit according to the manufacturer's
instructions. The cell culture medium is changed 5 h after transfection. 48 h after transfection,
cells are washed in cold PBS, scraped into 1 ml of IP buffer (50 mM Tris-HCl pH 7.5,
120 mM NaCl, 0.5 mM EDTA, 0.5% NP-40) and incubated on a rocker for 20 min. Cellular
debris is pelleted in a centrifuge at 14K for 20 min. The supernatant is precleared for 1 h with
protein G beads (Pharmacia Biotech) in IP buffer. Immunoprecipitations are performed by
incubating the precleared supernatant with either α-FLAG M2 agarose affinity gel (Sigma) for 2
h at 4°C or anti-HDAC2 (Santa Cruz) for 1 h followed by incubation with protein G beads for 1
h at 4°C. The beads are then washed three times for 5 min in IP buffer and then washed three
times in high salt IP buffer (50 mM Tris-HCl pH 7.5, 1000 mM NaCl, 0.5 mM EDTA, 0.5%
NP-40) at 4°C. IPs are then washed two times for 2 min in 1 ml of HD-buffer (10 mM Tris-HCl pH
8.0, 10 mM NaCl, 10% glycerol). When trapoxin (TPX) inhibition is determined, IPs are incubated
with 0.3, 3, 30 and 300 nM TPX in HD-buffer for 20 min. Supernatants are incubated with 100000
cpm substrate ([3H]-Ac(H4 1-24) SGRGKGGKGLGKGGAKRHRKVLRD, in vitro/chemically
acetylated using BOP-chemistry) in 30 ul HD-buffer or TPX in HD-buffer, resuspending the
sepharose by gently tapping the tube and shaking in an Eppendorf 5436 Thermomixer at full
speed at 37°C for 2 h. 170 ul HD-buffer and 50 ul stop-mix (1M HCl, 0.16M HAc) are added and
vortexed for 15 min; 600 ul ethylacetate is then added and vortexed for 45 minutes, and the mixture
is then centrifuged at 14000g for 7 minutes. 540 ul of the organic (upper) phase is then counted in
5 ml scintillation liquid using conventional techniques.

HDAC9 is catalytically active. In vitro histone deacetylase assays using
immunoprecipitated HDAC9 and a 3H-acetylated histone H4 peptide as substrate were
performed to determine whether HDAC9 was catalytically active and to compare the activity of
HDAC9 to known catalytically active HDAC1, HDAC3, and HDAC4. An HDAC-related protein
that lacks catalytic activity, HDRP/MITR/HDACC, was used as a negative control (Zhou, X.,
Richon, V.M., Rifkind, R.A., Marks, P.A. (2000) Identification of a transcriptional repressor
related to the noncatalytic domain of histone deacetylases 4 and 5. Proc Natl Acad Sci USA 97,
1056-61). These results demonstrated that HDAC9 could deacetylate the histone peptide
substrate at a level that was equivalent to HDAC3 and HDAC4 (Fig. 12A), while HDAC1 was
more effective in this assay (Fig. 12B).

Example 8: HDAC9 expression and cellular localization

HDAC9 is expressed in vitro using 1 ug of the M6 clone, 2 ul of 35S-Methionine and the Sp6
TNT Quick Coupled Transcription/Translation System according to the manufacturer's instructions
(Promega, Madison, WI). Proteins are electrophoresed on an SDS-PAGE gel according to
conventional methods and visualized by a Storm phosphorimager. The complete HDAC9
sequence molecular weight is estimated in silico as 72 kDa using VectorNTI Suite software
(Informax, North Bethesda, MD). A doublet was observed on a 10% SDS-PAGE gel. Doublets
have also been observed when HDAC1 is translated in vitro. These doublets suggest that there is
potentially a second translation initiation site. Furthermore, these results suggest that HDAC9 is
an expressed gene. See Figure 13.
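
The in silico molecular weight quoted above can be sanity-checked with a rough rule of thumb of about 110 Da per amino acid residue. The Python sketch below applies that rule to the 673 amino acid HDAC9 peptide. It is an approximation chosen for illustration and is not the method used by VectorNTI, which sums the masses of the actual residues; the constant and function name are assumptions.

```python
# Rule-of-thumb average mass per residue (an approximation, assumed here).
AVERAGE_RESIDUE_MASS_DA = 110.0


def approx_mass_kda(n_residues):
    """Rough protein mass in kDa from the number of amino acid residues."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


# 673 residues is the HDAC9 peptide length given in the text.
print(approx_mass_kda(673))
```

The rough estimate of about 74 kDa is in the same range as the 72 kDa obtained from the full sequence.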

1x10^5 Cos7 cells are plated onto chamber slides. Cells are transfected on the slides with
2 ug of flag epitope-tagged HDAC9 or a cytoplasmically expressed protein (Ena-flag) using
Geneporter2 in serum free medium according to the manufacturer's instructions. The cell culture
medium is changed 24 h after transfection. 48 h after transfection, cells are washed three times
with PBS, fixed for 15 min in 5% formaldehyde, washed two times in PBS, and blocked for 30
minutes at room temperature in 10% fetal calf serum (Sigma) in PBS with 0.5% Triton-X-100 to
permeabilize the cells. The cells are washed again two times in PBS and then incubated with 25
mg/ml anti-Flag-FITC conjugate for 1 hour. The stained cells are washed with PBS and
photographed using fluorescence microscopy.

HDAC9 is a nuclear protein. The translated HDAC9 peptide sequence predicts a 72
kDa protein and this was confirmed by in vitro translation (Fig. 13A). In order to determine the
cellular localization of HDAC9, flag epitope-tagged HDAC9, Enabled (Ena) or pCMV4flag
were transfected into Cos7 and 293 cells, or cells were mock transfected without plasmid. The
flag epitope was detected by fluorescence immunocytochemistry 48 h after transfection (Fig.
13B). Ena is a cytoskeleton-associated cytoplasmic protein substrate of Abl tyrosine kinase that
transduces the axon-repulsive function of the Roundabout receptor during axon guidance
(Gertler FB, Comer AR, Juang JL, Ahern SM, Clark MJ, Liebl EC, Hoffmann FM. (1995)
enabled, a dosage-sensitive suppressor of mutations in the Drosophila Abl tyrosine kinase,
encodes an Abl substrate with SH3 domain-binding properties. Genes Dev. 9, 521-533; Bashaw
GJ, Kidd T, Murray D, Pawson T, Goodman CS. (2000) Repulsive axon guidance: Abelson and
Enabled play opposing roles downstream of the roundabout receptor. Cell. 101, 703-715). As
expected, Ena was detected in the cytoplasm, whereas HDAC9 was detected in the nuclei of
these cells. The detection of HDAC9 in the nuclei of both Cos7 and 293 cells suggested that
HDAC9 was predominantly a nuclear protein.

Example 9: Identification of associated proteins in HDAC complexes

Transfection. 1x10^7 Cos7 cells are transfected with 10 ug of either C-terminally flag
epitope-tagged HDAC1, HDAC2, HDAC3, HDAC4, HDAC6, HDAC7, or HDAC9 in
pCDNA3.1 expression vector, or Flag vector or buffer (Mock) as transfection controls, by
electroporation using a Gene Pulser II instrument (Biorad, Hercules, CA) set at 0.3 kV/500 µF.

Immunoprecipitation. Immunoprecipitations are performed as described (Grozinger, C.
M., Hassig, C. A., and Schreiber, S. L. 1999. Proc. Natl. Acad. Sci. USA 96, 4868-4873). Whole
cell extracts are prepared 48 h after transfection by scraping cells into JLB buffer (50 mM
Tris-HCl, pH 8, 150 mM NaCl, 10% glycerol, 0.3% Triton-X-100) containing complete protease
inhibitor cocktail (Boehringer-Mannheim). Lysis is continued at 4°C for 10 min and then
cellular debris is pelleted by centrifugation at 14K for 5 minutes. Supernatants are pre-cleared
with Sepharose A/G-plus agarose beads (Santa Cruz). Recombinant proteins are
immunoprecipitated from pre-cleared supernatant by incubation with α-FLAG M2 agarose
affinity gel (Sigma) for 2 h at 4°C or anti-HDAC1 (Santa Cruz, Santa Cruz, CA) for 1 h at 4°C,
followed by incubation with Sepharose A/G beads. For Western blot analysis, the beads are
washed with MSWB buffer (50 mM Tris-HCl, pH 8, 150 mM NaCl, 1 mM EDTA, 0.1% NP-40)
and the proteins are separated by SDS/PAGE. Western blots are probed with anti-flag M2
(Sigma), anti-HDAC1 (Santa Cruz), anti-HDAC2 (Santa Cruz), anti-HDAC6 (Santa Cruz), anti-Rb
(Pharmingen), or anti-mSin3A (Transduction Labs, Lexington, KY).

HDAC9 associates with proteins in the mSin3A complex. Class I HDACs, but not
class II HDACs, were previously found to be associated with the mSin3A complexes. The core
HDAC1 complex consists of HDAC1, HDAC2, RbAp46, and RbAp48. This core complex has been
found to associate with an mSin3A complex that is involved in transcriptional repression through
an Rb and E2F complex (Luo RX, Postigo AA, Dean DC. (1998) Rb interacts with histone
deacetylase to repress transcription. Cell. 92, 463-473; Magnaghi-Jaulin L, Groisman R,
Naguibneva I, Robin P, Lorain S, Le Villain JP, Troalen F, Trouche D, Harel-Bellan A. (1998)
Retinoblastoma protein represses transcription by recruiting a histone deacetylase. Nature. 391,
601-605; Brehm A, Miska EA, McCance DJ, Reid JL, Bannister AJ, Kouzarides T. (1998)
Retinoblastoma protein recruits histone deacetylase to repress transcription. Nature. 391, 597-
601). In order to determine whether HDAC9 was a part of this complex, endogenous HDAC1,
HDAC2, Rb, and mSin3 proteins were co-immunoprecipitated from cells transfected with
flag-epitope tagged HDAC1, HDAC3, HDAC4, HDAC6, HDAC7 or HDAC9. To assure that
transfected flag epitope-tagged HDACs could be detected in cells, the levels of HDAC
expression were detected by immunoprecipitation and Western blotting with antiserum to the
flag epitope. To determine which HDACs associated with components of the Sin3 complex,
endogenous proteins in the Sin3 complex were immunoprecipitated and the associated HDACs
were detected by Western blotting with flag epitope-specific antibody. HDAC9 was found to
associate with HDAC1, HDAC2, Rb, and mSin3A, suggesting that HDAC9 is a component of an
mSin3A complex.

HDAC9 associates with SMRT and NCoR. Since corepressors SMRT and NCoR
associate with the mSin3 core complex, experiments were performed to co-immunoprecipitate
HDACs with NCoR and SMRT (Fig. 15). HDAC9 co-immunoprecipitated with both of these
proteins, suggesting that HDAC9 associates with SMRT and NCoR. Western analysis of the
flag-detected blots with anti-NCoR indicated that NCoR was immunoprecipitated. As previously
reported, SMRT co-immunoprecipitated with HDAC4 and HDAC6, and HDAC6 and HDAC7
did not associate with the Sin3A complex.

HDAC9 associates with 14-3-3 and Erk proteins. HDAC4 was previously found to
associate with 14-3-3-β, 14-3-3-ε, CamK, Erk1, and Erk2 proteins, which sequester HDAC4 in
the cytoplasm and prevent phosphorylated HDAC4 and HDAC5 from entering the nucleus and
repressing MEF2 activated transcription. In order to determine whether HDAC9 associates with
these proteins, experiments were performed to co-immunoprecipitate HDACs with 14-3-3 and
Erk proteins. All of the HDACs tested associated with 14-3-3s and Erks. These results suggest
that the association of HDACs with 14-3-3 and Erks might be a general mechanism of
sequestering HDACs in the cytoplasm.

Classification of HDAC9. HDACs have been classified by sequence similarity to yeast 
HDACs, sequence length, location of catalytic domain, cellular localization, associating proteins, 
and sensitivity to HDAC inhibitors. The data in this study suggest that HDAC9 has 
characteristics of both class I and class II HDACs. HDAC9 had sequence similarity with class II 
yeast hda1 subfamily member Clr3 and HDAC6 catalytic domain 1. In addition, the 3 Kb 
HDAC9 transcript was only detected in kidney and testis, suggesting that it might have a limited 
tissue distribution like class II HDACs. HDAC9 was between class I and class II HDACs in 
length. Class I HDACs average 443 amino acids in length, whereas class II HDACs average 
1069 amino acids in length. However, HDAC9 was found to have an N-terminal catalytic 
domain, as opposed to the C-terminal domains that have been found in class II HDACs. HDAC6 
is an exception that has both N-terminal and C-terminal catalytic domains. Furthermore, class I 
HDACs are nuclear proteins, while class II HDACs are nucleo-cytoplasmic. 
Immunocytochemistry indicated that HDAC9 was predominantly nuclear and was detected in a 
different subcellular compartment in comparison to the Ena protein that is expressed in the 
cytoplasm. In contrast to the 3 Kb HDAC9 transcript that might be differentially expressed, a 
3.5 Kb HDAC9 transcript that was identified by Northern analysis was expressed ubiquitously in 
normal tissues, tumor tissues and cell lines, similar to class I HDACs. In addition, HDAC9 was 
found to co-immunoprecipitate with proteins that were previously only associated with class I 
HDAC complexes, including HDAC1, HDAC2, mSin3A, and Rb. HDAC9 also has putative 
C-terminal LXCXE motifs that so far have only been found in HDAC1. HDAC9 was also found 
to associate with NCoR and SMRT. This evidence suggests that HDAC9 has characteristics that 
bridge those of class I and class II HDACs. 
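The length comparison above can be made concrete with a small calculation. The following is a minimal sketch, assuming the class averages quoted in the text (443 and 1069) and the 673-residue HDAC9 length listed in Figure 4; the function name `relative_position` is hypothetical and used only for illustration.

```python
# Average lengths quoted in the text and the HDAC9 length from Figure 4.
CLASS_I_MEAN = 443
CLASS_II_MEAN = 1069
HDAC9_LENGTH = 673


def relative_position(length, low, high):
    """Return where `length` falls between `low` (0.0) and `high` (1.0)."""
    return (length - low) / (high - low)


position = relative_position(HDAC9_LENGTH, CLASS_I_MEAN, CLASS_II_MEAN)
gap_to_class_i = HDAC9_LENGTH - CLASS_I_MEAN
gap_to_class_ii = CLASS_II_MEAN - HDAC9_LENGTH

print(f"HDAC9 lies {position:.0%} of the way from the class I to the class II average")
print(f"difference from class I average: {gap_to_class_i}")
print(f"difference from class II average: {gap_to_class_ii}")
```

On these figures HDAC9 is longer than the class I average and shorter than the class II average, which is all the text claims; the calculation says nothing about domain organization.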
What is claimed is: 

1. An isolated polypeptide comprising the amino acid sequence set forth in 
SEQ ID NO:1, SEQ ID NO:5 or SEQ ID NO:6. 

2. An isolated polypeptide consisting of the amino acid sequence set forth in 
SEQ ID NO:1, SEQ ID NO:5 or SEQ ID NO:6. 

3. An isolated DNA comprising a nucleic acid sequence that encodes the 
polypeptide of claim 1 or 2. 

4. A vector molecule comprising at least a fragment of the isolated DNA 
according to claim 3. 

5. The vector molecule according to claim 4 comprising transcriptional control 
sequences. 

6. A host cell comprising the vector molecule according to claim 5. 

7. The isolated DNA according to claim 3, comprising a nucleotide sequence 
selected from the group consisting of (1) the nucleotide sequence set forth in SEQ ID NO:2, 7 or 
8, being the complete cDNA sequence encoding the polypeptide as defined in claim 2; (2) the 
nucleotide sequence set forth in SEQ ID NO:3, being the open reading frame of the cDNA 
sequence encoding the polypeptide as defined in claim 2; (3) a nucleotide sequence capable of 
hybridizing under high stringency conditions to a nucleotide sequence set forth in SEQ ID NO:3; 
and (4) the nucleotide sequence set forth in SEQ ID NO:4, being the endogenous genomic 
human DNA encoding the polypeptide as defined in claim 2. 

8. A vector molecule comprising at least a fragment of an isolated DNA 
molecule according to claim 7. 

9. The vector molecule according to claim 8 comprising transcriptional 
control sequences. 

10. A host cell comprising the vector molecule according to claim 9. 

11. A host cell which can be propagated in vitro and which is capable upon 
growth in culture of producing a polypeptide according to claim 1 or 2, wherein said cell 
comprises at least one transcriptional control sequence that is not a transcriptional control 
sequence of the natural endogenous human gene encoding the polypeptide of claim 2, wherein 
said one or more transcriptional control sequences control transcription of a DNA encoding a 
polypeptide according to claim 1 or 2. 

12. A method for the diagnosis of a condition associated with abnormal 
regulation of gene expression which includes abnormal cell proliferation, cancer, 
atherosclerosis, inflammatory bowel disease, host inflammatory or immune response, or 
psoriasis in a human which comprises: detecting abnormal transcription of messenger RNA 
transcribed from the natural endogenous human gene encoding the polypeptide as defined in 
claim 2 in an appropriate tissue or cell from a human, wherein said abnormal transcription is 
diagnostic of said condition. 

13. The method of claim 12, wherein said natural endogenous human gene 
comprises the nucleotide sequence set forth in SEQ ID NO:4, 7 or 8. 

14. The method of claim 12, comprising contacting a sample of said 
appropriate tissue or cell or contacting an isolated RNA or DNA molecule derived from said 
tissue or cell with an isolated nucleotide sequence of at least about 15-20 nucleotides in length 
that hybridizes under high stringency conditions with the isolated nucleotide sequence as defined 
in claim 3. 

15. A method for the diagnosis of a condition associated with abnormal 
HDAC9 expression or activity in a human which comprises: 

measuring the amount of a polypeptide comprising the amino acid 
sequence set forth in SEQ ID NO:1, 5 or 6 or fragments thereof, in an appropriate tissue or cell 
from a human suffering from said condition wherein the presence of an abnormal amount of said 
polypeptide or fragments thereof, relative to the amount of said polypeptide or fragments thereof 
in the respective tissue from a human not suffering from said condition associated with abnormal 
HDAC9 expression or activity is diagnostic of said human's suffering from a condition 

16. The method of claim 15, wherein said detecting step comprises contacting 
said appropriate tissue or cell with an antibody which specifically binds to a polypeptide that 
comprises the amino acid sequence set forth in SEQ ID NO:1, 5 or 6 or a fragment thereof and 
detecting specific binding of said antibody with a polypeptide in said appropriate tissue or cell, 
wherein detection of specific binding to a polypeptide indicates the presence of a polypeptide 
that comprises the amino acid sequence set forth in SEQ ID NO:1, 5 or 6 or a fragment thereof. 

17. An antibody or a fragment thereof which specifically binds to a 
polypeptide that comprises the amino acid sequence set forth in SEQ ID NO:1, 5 or 6 or to a 
fragment of said polypeptides. 

18. An antibody fragment according to claim 17 which is an Fab or F(ab')2 
fragment. 

19. An antibody according to claim 17 which is a polyclonal antibody. 

20. An antibody according to claim 17 which is a monoclonal antibody. 

21. A method for producing a polypeptide as defined in claim 1 or 2, which 
method comprises: 

culturing a host cell having incorporated therein an expression vector comprising 
an exogenously-derived polynucleotide encoding a polypeptide comprising an amino acid 
sequence as set forth in SEQ ID NO:1, 5 or 6 under conditions sufficient for expression of the 
polypeptide in the host cell, thereby causing the production of the expressed polypeptide. 

22. The method according to claim 21, said method further comprising recovering 
the polypeptide produced by said cell. 

23. The method according to claim 21, wherein said exogenously-derived 
polynucleotide encodes a polypeptide consisting of an amino acid sequence set forth in 
SEQ ID NO:1, 5 or 6. 

24. The method according to claim 21, wherein said exogenously-derived 
polynucleotide comprises the nucleotide sequence as set forth in SEQ ID NO:2, 7 or 8. 

25. The method according to claim 21, wherein said exogenously-derived 
polynucleotide comprises the nucleotide sequence as set forth in SEQ ID NO:3. 

26. The method according to claim 21, wherein said exogenously-derived 
polynucleotide consists of the nucleotide sequence as set forth in SEQ ID NO:3. 

27. The method according to claim 24, wherein said exogenously-derived 
polynucleotide comprises the nucleotide sequence as set forth in SEQ ID NO:4. 

Figure 1. 

1 GGCGCCGAGG CTTCTGCGTC CGTCGTGGTT CCTCGCTCCG 
41 GGGCGGAGTT CGCGATAGCG ATCGGGGAGC AGGACGCGGG 
82 GCGTGGACCC AGGTCCGAGG CGAGGAAGCC GTAACCCATG 
123 CGCGGGGAGC CTCCCCCTTC GACTGCAGCC TCGCTCCGTG 
164 CCTTCTGCGC GCCTGGGaTC CCGGAGCCTG CCTAGGTTCT 
205 GTGCGCTCCC GCCCAGGCCG GTGCCCGCCG CCCGCCTGCG 
246 CCCCAGGCAG GTCCCAGGCC TCCGGCTGCT CCCGGCCGAA 
287 GCCCCGAGTG CGAGATCGAG CGTCCTGAGC GCCTGACCGC 
328 AGCCCTGGAT CGCCTGCGGC AGCGCGGCCT GGAACAGAGG 
369 TGTCTGCGGT TGTCAGCCCG CGAGGCCTCG GAAGAGGAGC 
410 TGGGCCTGGT GCACAGAGTA (XTTTCACTG CGCGCGGCTG 
451 GCCGCAGGGG CTGGACTGCA GCTGGTGGAC GCTGTGCTCA 
492 CTGGAGCTGT GCAAAATGGG CTTGCCCTGG TGAGGCCTCC 
533 CGGGCACCAT GGCCAGAGGG CGGCTGCCAA CGGGTTCTGT 
574 GTGTTCAACA ACGTGGCCAT AGCAGCTGCA CATGCCAAGC 
615 AGAAACACGG GCTACACAGG ATCCTCGTCG TGGACTGGGA 
655 TGTGCACCAT GGCCAGGGGA TCCAGTATCT CTTTGAGGAT 
696 GACCCCAGCG TCCTTTACTT CTCCTGGCAC CGCTATGAGC 
737 ATGGGCGCTT CTGGCCTTTC CTGCGAGAGT CAGATGCAGA 
778 CGCAGTGGGG CGGGGACAGG GCCTCGGCTT CACTGTCAAC 
819 CTGCCCTGGA ACCAGGTTGG GATGGGAAAC GCTGACTACG 
860 TGGCTGCCTT CCTGCACCTG CTGCTCCCAC TGGCCTTTGA 
901 GTTTGACCCT GAGCTGGTGC TGGTCTCGGC AGGATTTGAC 
942 TCAGCCATCG GGGACCCTGA GGGGCAAATG CAGGCCACGC 
983 CAGAGTGCTT CGCCCACCTC ACACAGCTGC TGCAGGTGCT 
1024 GGCCGGCGGC CGGGTCTGTG CCGTGCTGGA GGGCGGCTAC 
1065 CACCTGGAGT CACTGGCGGA GTCAGTGTGC ATGACAGTAC 
1106 AGACGCTGCT GGGTGACCCG GCCCCACCCC TGTCAGGGCC 
1147 AATGGCGCC 

Figure 2. 



A. 

1 atggggaccg cgcttgtgta ccatgaggac atgacggcca cccggctgct ctgggacgac 
61 cccgagtgcg agatcgagcg tcctgagcgc ctgaccgcag ccctggatcg cctgcggcag 
121 cgcggcctgg aacagaggtg tctgcggttg tcagcccgcg aggcctcgga agaggagctg 
181 ggcctggtgc acagcccaga gtatgtatcc ctggtcaggg agacccaggt cctaggcaag 
241 gaggagctgc aggcgctgtc cggacagttc gacgccatct acttccaccc gagtaccttc 
301 cactgcgcgc ggctggccgc aggggctgga ctgcagctgg tggacgctgt gctcactgga 
361 gctgtgcaaa atgggcttgc cctggtgagg cctcccgggc accatggcca gagggcggct 
421 gccaacgggt tctgtgtgtt caacaacgtg gccatagcag ctgcacatgc caagcagaaa 
481 cacgggctac acaggatcct cgtcgtggac tgggatgtgc accatggcca ggggatccag 
541 tatctctttg aggatgaccc cagcgtcctt tacttctcct ggcaccgcta tgagcatggg 
601 cgcttctggc ctttcctgcg agagtcagat gcagacgcag tggggcgggg acagggcctc 
661 ggcttcactg tcaacctgcc ctggaaccag gttgggatgg gaaacgctga ctacgtggct 
721 gccttcctgc acctgctgct cccactggcc tttgagtttg accctgagct ggtgctggtc 
781 tcggcaggat ttgactcagc catcggggac cctgaggggc aaatgcaggc cacgccagag 
841 tgcttcgccc acctcacaca gctgctgcag gtgctggccg gcggccgggt ctgtgccgtg 
901 ctggagggcg gctaccacct ggagtcactg gcggagtcag tgtgcatgac agtacagacg 
961 ctgctgggtg acccggcccc acccctgtca gggccaatgg cgccatgtca gaggtgcgag 
1021 gggagtgccc tagagtccat ccagagtgcc cgtgctgccc aggccccgca ctggaagagc 
1081 ctccagcagc aagatgtgac cgctgtgccg atgagcccca gcagccactc cccagagggg 
1141 aggcctccac ctctgctgcc tgggggtcca gtgtgtaagg cagctgcatc tgcaccgagc 
1201 tccctcctgg accagccgtg cctctgcccc gcaccctctg tccgcaccgc tgttgccctg 
1261 acaacgccgg atatcacatt ggttctgccc cctgacgtca tccaacagga agcgtcagcc 
1321 ctgagggagg agacagaagc ctgggccagg ccacacgagt ccctggcccg ggaggaggcc 
1381 ctcactgcac ttgggaagct cctgtacctc ttagatggga tgctggatgg gcaggtgaac 
1441 agtggtatag cagccactcc agcctctgct gcagcagcca ccctggatgt ggctgttcgg 
1501 agaggcctgt cccacggagc ccagaggctg ctgtgcgtgg ccctgggaca gctggaccgg 
1561 cctccagacc tcgcccatga cgggaggagt ctgtggctga acatcagggg caaggaggcg 
1621 gctgccctat ccatgttcca tgtctccacg ccactgccag tgatgaccgg tggtttcctg 
1681 agctgcatct tgggcttggt gctgcccctg gcctatggct tccagcctga cctggtgctg 
1741 gtggcgctgg ggcctggcca tggcctgcag ggcccccacg ctgcactcct ggctgcaatg 
1801 cttcgggggc tggcaggggg ccgagtcctg gccctcctgg aggagaactc cacaccccag 
1861 ctagcaggga tcctggcccg ggtgctgaat ggagaggcac ctcctagcct aggcccttcc 
1921 tctgtggcct ccccagagga cgtccaggcc ctgatgtacc tgagagggca gctggagcct 
1981 cagtggaaga tgttgcagtg ccatcctcac ctggtggctt ga 

B. 

MGTALVYHED MTATRLLWDD PECEIERPER LTAALDRLRQ RGLEQRCLRL SAREASEEEL 
GLVHSPEYVS LVRETQVLGK EELQALSGQF DAIYFHPSTF HCARLAAGAG LQLVDAVLTG 
AVQNGLALVR PPGHHGQRAA ANGFCVFNNV AIAAAHAKQK HGLHRILVVD WDVHHGQGIQ 
YLFEDDPSVL YFSWHRYEHG RFWPFLRESD ADAVGRGQGL GFTVNLPWNQ VGMGNADYVA 
AFLHLLLPLA FEFDPELVLV SAGFDSAIGD PEGQMQATPE CFAHLTQLLQ VLAGGRVCAV 
LEGGYHLESL AESVCMTVQT LLGDPAPPLS GPMAPCQRCE GSALESIQSA RAAQAPHWKS 
LQQQDVTAVP MSPSSHSPEG RPPPLLPGGP VCKAAASAPS SLLDQPCLCP APSVRTAVAL 
TTPDITLVLP PDVIQQEASA LREETEAWAR PHESLAREEA LTALGKLLYL LDGMLDGQVN 
SGIAATPASA AAATLDVAVR RGLSHGAQRL LCVALGQLDR PPDLAHDGRS LWLNIRGKEA 
AALSMFHVST PLPVMTGGFL SCILGLVLPL AYGFQPDLVL VALGPGHGLQ GPHAALLAAM 
LRGLAGGRVL ALLEENSTPQ LAGILARVLN GEAPPSLGPS SVASPEDVQA LMYLRGQLEP 
QWKMLQCHPH LVA 
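The relationship between panels A and B of Figure 2 can be checked by conceptual translation. The following is a minimal sketch, assuming the standard genetic code and taking the first 60 and the last 42 nucleotides of the open reading frame in panel A as input; the names `CODON_TABLE` and `translate` are hypothetical helpers introduced for illustration, not part of the original disclosure.

```python
# Standard genetic code, built in the conventional t, c, a, g codon order.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    first + second + third: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}


def translate(orf):
    """Translate a DNA open reading frame, stopping at the first stop codon.

    Whitespace (the 10-base grouping used in the figure) is ignored.
    """
    sequence = "".join(orf.lower().split())
    protein = []
    for start in range(0, len(sequence) - 2, 3):
        residue = CODON_TABLE[sequence[start:start + 3]]
        if residue == "*":  # stop codon ends the protein
            break
        protein.append(residue)
    return "".join(protein)


# First line of panel A (nucleotides 1 to 60).
first_line = "atggggaccg cgcttgtgta ccatgaggac atgacggcca cccggctgct ctgggacgac"
# Last line of panel A (nucleotides 1981 to 2022, ending in the stop codon tga).
last_line = "cagtggaaga tgttgcagtg ccatcctcac ctggtggctt ga"

print(translate(first_line))  # MGTALVYHEDMTATRLLWDD
print(translate(last_line))   # QWKMLQCHPHLVA
```

The two outputs correspond to the first twenty and the last thirteen residues of panel B, which is a quick way to confirm that a transcribed copy of the nucleotide sequence is in frame.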
Figure 3. 

AL022328 vs HDAC9: 

[Pairwise nucleotide alignment of AL022328 (positions 2 to 5519) with HDAC9 (positions 1 to 2015). The alignment rows and match lines are not legible in this copy.] 

Figure 4. 

Alignment results 

Sequence format is Pearson 
Sequence 1: HDAC1 482 aa 
Sequence 2: HDAC2 488 aa 
Sequence 3: HDAC3 428 aa 
Sequence 4: HDAC8 377 aa 
Sequence 5: HDAC4 1084 aa 
Sequence 6: HDAC5 1122 aa 
Sequence 7: HDAC6 1122 aa 
Sequence 8: HDAC7 855 aa 
Sequence 9: HDAC9 673 aa 
Start of Pairwise alignments 
Aligning... 
Sequences (1:2) Aligned. Score: 82 
Sequences (1:3) Aligned. Score: 57 
Sequences (1:4) Aligned. Score: 38 
Sequences (1:5) Aligned. Score: 18 
Sequences (1:6) Aligned. Score: 14 
Sequences (1:7) Aligned. Score: 14 
Sequences (1:8) Aligned. Score: 15 
Sequences (1:9) Aligned. Score: 14 
Sequences (2:3) Aligned. Score: 55 
Sequences (2:4) Aligned. Score: 39 
Sequences (2:5) Aligned. Score: 13 
Sequences (2:6) Aligned. Score: 15 
Sequences (2:7) Aligned. Score: 15 
Sequences (2:8) Aligned. Score: 14 
Sequences (2:9) Aligned. Score: 15 
Sequences (3:4) Aligned. Score: 37 
Sequences (3:5) Aligned. Score: 12 
Sequences (3:6) Aligned. Score: 13 
Sequences (3:7) Aligned. Score: 13 
Sequences (3:8) Aligned. Score: 15 
Sequences (3:9) Aligned. Score: 15 
Sequences (4:5) Aligned. Score: 21 
Sequences (4:6) Aligned. Score: 16 
Sequences (4:7) Aligned. Score: 16 
Sequences (4:8) Aligned. Score: 20 
Sequences (4:9) Aligned. Score: 22 
Sequences (5:6) Aligned. Score: 59 
Sequences (5:7) Aligned. Score: 59 
Sequences (5:8) Aligned. Score: 49 
Sequences (5:9) Aligned. Score: 21 
Sequences (6:7) Aligned. Score: 100 
Sequences (6:8) Aligned. Score: 43 
Sequences (6:9) Aligned. Score: 19 
Sequences (7:8) Aligned. Score: 43 
Sequences (7:9) Aligned. Score: 19 
Sequences (8:9) Aligned. Score: 20 
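The pairwise scores involving sequence 9 can be summarized with a short script. The following is a minimal sketch, assuming the scores read from the pairwise alignment log above, keyed by the sequence names the log assigns to sequences 1 to 8; the dictionary and variable names are hypothetical, and the class groupings (class I: HDAC1, HDAC2, HDAC3, HDAC8; class II: HDAC4, HDAC5, HDAC6, HDAC7) follow the classification used in the text.

```python
from statistics import mean

# Pairwise alignment scores of HDAC9 (sequence 9) against sequences 1 to 8,
# transcribed from the log lines "Sequences (n:9) Aligned. Score: ...".
HDAC9_SCORES = {
    "HDAC1": 14,
    "HDAC2": 15,
    "HDAC3": 15,
    "HDAC8": 22,
    "HDAC4": 21,
    "HDAC5": 19,
    "HDAC6": 19,
    "HDAC7": 20,
}

CLASS_I = ("HDAC1", "HDAC2", "HDAC3", "HDAC8")
CLASS_II = ("HDAC4", "HDAC5", "HDAC6", "HDAC7")

# Rank partners from highest to lowest score.
ranked = sorted(HDAC9_SCORES.items(), key=lambda item: item[1], reverse=True)
best_partner, best_score = ranked[0]

class_i_mean = mean(HDAC9_SCORES[name] for name in CLASS_I)
class_ii_mean = mean(HDAC9_SCORES[name] for name in CLASS_II)

print(f"highest pairwise score: {best_partner} ({best_score})")
print(f"mean score against class I sequences: {class_i_mean}")
print(f"mean score against class II sequences: {class_ii_mean}")
```

All of these scores are low and close together, so the ranking should be read only as a rough summary of the log, not as evidence for assigning HDAC9 to either class.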

Guide tree file created: [/bioinf nv/software/biobenchsw/tmp/align/1478.dnd] 

Start of Multiple Alignment 
There are 8 groups 
Aligning... 
Group 1: Sequences: 2 Score: 24259 
Group 2: Sequences: 3 Score: 18415 
Group 3: Sequences: 4 Score: 12882 
Group 4: Delayed 
Group 5: Sequences: 2 Score: 9847 
Group 6: Sequences: 3 Score: 7569 
Group 7: Sequences: 4 Score: 5689 
Group 8: Sequences: 8 Score: 2841 
Sequence:9 Score: 3452 
Alignment Score 36872 

CLUSTAL-Alignment file created [/bioinf nv/software/biobenchsw/tmp/align/1478.out] 
CLUSTAL W (1.81) multiple sequence alignment 



[Multiple sequence alignment of HDAC5, HDAC6, HDAC4, HDAC7, HDAC1, HDAC2, HDAC3, HDAC8 and HDAC9, with rows in that order. The alignment blocks are not legible in this copy.] 
DKRI ACDEEFSDSEDEGEGGRRNVADHRKGARRAR3 EEDKXETEDXXTDVXEEDXSXDHS 
• ADAEERGP - EENYSRPEAPNEFYDGDHDND* - 
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HDAC8 

HDAC5 RTAVA1/TTPD1TLV1,PPDV1 OQEASALREETEAWXRPHESLAJlEEALtAl^GKLLyLLDGM 



HDAC5 

HDAC6 

HDAC4 

HDAC7 -? 

HDAC1 - EXPEARGVHXEVKXaA - 



HDAC2 GEXTDTKGTRSEQI-SNP - 

HDAC3 XESDVEJ - i . 

HDAC8 1 

. HDAC9 U)GQVNSG2AATPASAAAATlJyVAVlU*GLSHGAQRI,L^ 

H3&AC5 . 

HDAC6 — 

HDAC4 - 

HDAC7 

HDAC1 

KDAC2 

HDAC3 • 

HDAC8 . — 

HDAC9 IRGIUyUlALSMFHVSTPl^VOTGGFl.SCIl*GLV^ 

HDAC5 • 

HDAC6 - — - — . — — 

HDAC4 

HDAC7 — 

HDACl 

HDAC2 — - • „ 

HDAC3 

HDAC8 — — mtm 

HDAC9 A LLAAMLR GUV GGRV1*ALL»EEKSTP01»AG 3 LARVLNGEAPPSLGPSSVASPEDVQAUm, 



HDAC5 

HDAC6 

HDAC4 - 

HDAC7 

HDACl . — " ~ 

HDAC2 • • — 

HDAC3 

HDAC8 

HDAC9 PG0LEPQWKM1^CHPH1,VA 
Figure 8

Figure 9.

[Multiple sequence alignment of the HDAC4, HDAC5 and HDAC7 catalytic domains, HDAC6 catalytic domains 1 and 2, and the complete HDAC9 peptide. Legible header entries: HDAC5 catalytic domain, 329 aa; HDAC6 catalytic domain 1, 302 aa; HDAC7 catalytic domain (length not clearly legible); HDAC9 complete peptide, 673 aa. The aligned residues are not legible in the source scan.]
Figure 10.

[Multiple sequence alignment of the HDAC1, HDAC2, HDAC3 and HDAC8 catalytic domains with the complete HDAC9 peptide. The aligned residues are not legible in the source scan.]
Figure 11A

HDAC9v1 MGTALVYHEDMTATRLLWDDPECEIERPERLTAALDRLRQRGLEQRCLRLSAREASEEEL
HDAC9v2 MGTALVYHEDMTATRLLWDDPECEIERPERLTAALDRLRQRGLEQRCLRLSAREASEEEL
HDAC9v3 MGTALVYHEDMTATRLLWDDPECEIERPERLTAALDRLRQRGLEQRCLRLSAREASEEEL
*********************************** *************************

HDAC9v1 GLVHSPEYVSLVRETQVLGKEELQALSGQFDAIYFHPSTFHCARLAAGAGLQLVDAVLTG
HDAC9v2 GLVHSPEYVSLVRETQVLGKEELQALSGQFDAIYFHPSTFHCARLAAGAGLQLVDAVLTG
HDAC9v3 GLVHSPEYVSLVRETQVLGKEELQALSGQFDAIYFHPSTFHCARLAAGAGLQLVDAVLTG
* ********************* * *************************************

HDAC9v1 AVQNGLALVRPPGHHGQRAAANGFCVFNNVAIAAAHAKQ^
HDAC9v2 AVQNGLALVRPPGHHGQRAAANGFCVFNNVAIAAAHAKQKHGLHRILVVDWDVHHGQGIQ
HDAC9v3 AVQNGLALVRPPGHHGQRAAANGFCVFNNVAIAAAHAKQKHGLHRILVVDWDVHHGQGIQ
************************************************************

HDAC9v1 YLFEDDPSVLYFSWHRYEHGRFWPFLRESDADAVGRGQGLGFTVNLPWNQVGMGNADYVA
HDAC9v2 YLFEDDPSVLYFSWHRYEHGRFWPFLRESDADAVGRGQGLGFTVNLPWNQVGMGNADYVA
HDAC9v3 YLFEDDPSVLYFSWHRYEHGRFWPFLRESDADAVGRGQGLGFTVNLPWN
*************************************************

HDAC9v1 AFLHLLLPLAFEFDPELVLVSAGFDSAIGDPEGQMQATPECFAHLTQLLQVLAGGRVCAV
HDAC9v2 AFLHLLLPLAFEFDPELVLVSAGFDSAIGDPEGQMQATPECFAHLTQLLQVLAGGRVCAV
HDAC9v3 QFDPELVLVSAGFDSAIGDPEGQMQATPECFAHLTQLLQVLAGGRVCAV
*************************************************

HDAC9v1 LEGGYHLESLAESVCMTVQTLLGDPAPPLSGPMAPCQRCEGSALESIQSARAAQAPHWKS
HDAC9v2 LEGGYHLESLAESVCMTVQTLLGDPAPPLSGPMAPCQRCEGSALESIQSARAAQAPHWKS
HDAC9v3 LEGGYHLESLAESVCMTVQTLLGDPAPPLSGPMAPCQRCEGSALESIQSARAAQAPHWKS
************* ***********************************************

HDAC9v1 LQQQDVTAVPMSPSSHSPEGRPPPLLPGGPVCKAAASAPSSLLDQPCLCPAPSVRTAVAL
HDAC9v2 LQQQDVTAVPMSPSSHSPEGRPPPLLPGGPVCKAAASAPSSLLDQPCLCPAPSVRTAVAL
HDAC9v3 LQQQDVTAVPMSPSSHSPEGRPPPLLPGGPVCKAAASAPSSLLDQPCLCPAPSVRTAVAL
******************************************* *****************

HDAC9v1 TTPDITLVLPPDVIQQEA
HDAC9v2 TTPDITLVLPPDVIQQEASALREETEAWARPHESLAREEALTALGKLLYLLDGMLDGQVN
HDAC9v3 TTPDITLVLPPDVIQQEASALREETEAWARPHESLAREEALTALGKLLYLLDGMLDGQVN
******************

HDAC9v1
HDAC9v2 SGIAATPASAAAATLDVAVRRGLSHGAQRLLCVALGQLDRPPDLAHDGRSLWLNIRGKEA
HDAC9v3 SGIAATPASAAAATLDVAVRRGLSHGAQSWGVGEGLLEAMPGGSPAQRLSSHSTPAHGPV

HDAC9v1 CILGLVLPLAYGFQPDLVLVALGPGHGLQGPHAALLAAM
HDAC9v2 AALSMFHVSTPLPVMTGGFLSCILGLVLPLAYGFQPDLVLVALGPGHGLQGPHAALLAAM
HDAC9v3 NALPPLPLRFGLRRMTGGFLSCILGL^PLA^
*************************** # *

HDAC9v1 LRGLAGGRVLALLEENSTPQLAGILARVLNGEAPPSLGLSSVAS^^
HDAC9v2 LRGLAGGRVLALLEEVSWAGWR--CCGVGRGKGP--VTASVFAPGPELHTPASRDPGPGA
HDAC9v3 FGGWQG AESWPSWR RGRPGPYVPERAAGASVEDVAVPSSPGGLKSA
• *** *•*

HDAC9v1 QWKMLQCHPHLVA
HDAC9v2 EWRGTS
HDAC9v3 K
Figure 13
Figure 14



SEQ ID NO: 7 

>HDAC9v2 DNA sequence 

1 ATGGGGACCGCGCTTGTGTACCATGAGGACATGACGGCCACCCGGCTGCTCTGGGACGAC 
61 CCCGAGTGCGAGATCGAGCGTCCTGAGCGCCTGACCGCAGCCCTGGATCGCCTGCGGCAG 
121 CGCGGCCTGGAACAGAGGTGTCTGCGGTTGTCAGCCCGCGAGGCCTCGGAAGAGGAGCTG
181 GGCCTGGTGCACAGCCCAGAGTATGTATCCCTGGTCAGGGAGACCCAGGTCCTAGGCAAG 
241 GAGGAGCTGCAGGCGCTGTCCGGACAGTTCGACGCCATCTACTTCCACCCGAGTACCTTT 
301 CACTGCGCGCGGCTGGCCGCAGGGGCTGGACTGCAGCTGGTGGACGCTGTGCTCACTGGA 
361 GCTGTGCAAAATGGGCTTGCCCTGGTGAGGCCTCCCGGGCACCATGGCCAGAGGGCGGCT 
421 GCCAACGGGTTCTGTGTGTTCAACAACGTGGCCATAGCAGCTGCACATGCCAAGCAGAAA
481 CACGGGCTACACAGGATCCTCGTCGTGGACTGGGATGTGCACCATGGCCAGGGGATCCAG 
541 TATCTCTTTGAGGATGACCCCAGCGTCCTTTACTTCTCCTGGCACCGCTATGAGCATGGG 
601 CGCTTCTGGCCTTTCCTGCGAGAGTCAGATGCAGACGGAGTGGGGCGGGGACAGGGCCTC
661 GGCTTCACTGTCAACCTGCCCTGGAACCAGGTTGGGATGGGAAACGCTGACTACGTGGCT 
721 GCCTTCCTGCACCTGCTGCTCCCACTGGCCTTTGAGTTTGACCCTGAGCTGGTGCTGGTC 
781 TCGGCAGGATTTGACTCAGCCATCGGGGACCCTGAGGGGCAAATGCAGGCCACGCCAGAG 
841 TGCTTCGCCCACCTCACACAGCTGCTGCAGGTGCTGGCCGGCGGCCGGGTCTGTGCCGTG 
901 CTGGAGGGCGGCTACCACCTGGAGTCACTGGCGGAGTCAGTGTGCATGACAGTACAGACG
961 CTGCTGGGTGACCCGGCCCCACCCCTGTCAGGGCCAATGGCGCCATGTCAGAGGTGCGAG 
1021 GGGAGTGCCCTAGAGTCCATCCAGAGTGCCCGTGCTGCCCAGGCCCCGCACTGGAAGAGC 
1081 CTCCAGCAGCAAGATGTGACCGCTGTGCCGATGAGCCCCAGCAGCCACTCCCCAGAGGGG 
1141 AGGCCTCCACCTCTGCTGCCTGGGGGTCCAGTGTGTAAGGCAGCTGCATCTGCACCGAGC 
1201 TCCCTCCTGGACCAGCCGTGCCTCTGCCCCGCACCCTCTGTCCGCACCGCTGTTGCCCTG 
1261 ACAACGCCGGATATCACATTGGTTCTGCCCCCTGACGTCATCCAACAGGAAGCGTCAGCC 
1321 CTGAGGGAGGAGACAGAAGCCTGGGCCAGGCCACACGAGTCCCTGGCCCGGGAGGAGGCC 
1381 CTCACTGCACTTGGGAAGCTCCTGTACCTCTTAGATGGGATGCTGGATGGGCAGGTGAAC 
1441 AGTGGTATAGCAGCCACTCCAGCCTCTGCTGCAGCAGCCACCCTGGATGTGGCTGTTCGG
1501 AGAGGCGTGTCCCACGGAGCCCAGAGGCTGCTGTGCGTGGCCCTGGGACAGCTGGACCGG 
1561 CCTCCAGACCTCGCCCATGACGGGAGGAGTCTGTGGCTGAACATCAGGGGCAAGGAGGCG 
1621 GCTGCCCTATCCATGTTCCATGTCTCCACGCCACTGCCAGTGATGACCGGTGGTTTCGTG 
1681 AGCTGCATCTTGGGCTTGGTGCTGCCCCTGGCCTATGGCTTCCAGCCTGACCTGGTGCTG
1741 GTGGCGCTGGGGCCTGGCCATGGCCTGCAGGGCCCCCACGCTGCACTCCTGGCTGCAATG 
1801 CTTCGGGGGCTGGCAGGGGGCCGAGTCCTGGCCCTCCTGGAGGAGGTAAGCTGGGCAGGG 
1861 TGGAGGTGCTGCGGGGTGGGACGAGGGGAAGGACCAGTGACTGCTTCCGTCTTCGCCCCT 
1921 GGTCCAGAACTCCACACCCCAGCTAGCAGGGATCCTGGCCCGGGTGCTGAATGGAGAGGC 
1981 ACCTCCTAGCCTAGGCCTTTCCTCTGTGGCCTCCCCAGAGGACGTCCAGGCCCTGATGTA 
2041 CCTGAGAGGGCAGCTGGAGCCTCAGTGGAAGATGTTGCAGTGCCATCCTCACCTGGTGGC
2101 TTGA 

SEQ ID NO: 5

>HDAC9v2 peptide sequence 

1 MGTALVYHEDMTATRLLWDDPECEIERPERLTAALDRLRQRGLEQRCLRLSAREASEEEL
61 GLVHSPEYVSLVRETQVLGKEELQALSGQFDAIYFHPSTFHCARLAAGAGLQLVDAVLTG
121 AVQNGLALVRPPGHHGQRAAANGFCVFNNVAIAAAHAKQKHGLHRILVVDWDVHHGQGIQ
181 YLFEDDPSVLYFSWHRYEHGRFWPFLRESDADAVGRGQGLGFTVNLPWNQVGMGNADYVA
241 AFLHLLLPLAFEFDPELVLVSAGFDSAIGDPEGQMQATPECFAHLTQLLQVLAGGRVCAV
301 LEGGYHLESLAESVCMTVQTLLGDPAPPLSGPMAPCQRCEGSALESIQSARAAQAPHWKS
361 LQQQDVTAVPMSPSSHSPEGRPPPLLPGGPVCKAAASAPSSLLDQPCLCPAPSVRTAVAL
421 TTPDITLVLPPDVIQQEASALREETEAWARPHESLAREEALTALGKLLYLLDGMLDGQVN
481 SGIAATPASAAAATLDVAVRRGLSHGAQRLLCVALGQLDRPPDLAHDGRSLWLNIRGKEA
541 AALSMFHVSTPLPVMTGGFLSCILGLVLPLAYGFQPDLVLVALGPGHGLQGPHAALLAAM
601 LRGLAGGRVLALLEEVSWAGWRCCGVGRGKGPVTASVFAPGPELHTPASRDPGPGAEWRG
661 TS
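
As an illustrative cross-check that is not part of the original listing, a peptide listing can be compared against a translation of the corresponding DNA listing. The sketch below assumes the standard genetic code and a reading frame that starts at nucleotide 1; the names `translate`, `CODON_TABLE` and `first_60_nt` are illustrative choices, not identifiers from the specification.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), with codons enumerated
# in T, C, A, G order at each position. Assumption: this is the code that
# applies to the listed sequences.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate from the first base, stopping at the first stop codon."""
    # Remove whitespace left over from the listing layout and normalise case.
    seq = "".join(dna.split()).upper()
    peptide = []
    for i in range(0, len(seq) - 2, 3):
        # "X" marks a codon that contains an unreadable character.
        aa = CODON_TABLE.get(seq[i:i + 3], "X")
        if aa == "*":
            break
        peptide.append(aa)
    return "".join(peptide)


# First 60 nucleotides of the HDAC9v2 DNA listing (SEQ ID NO: 7).
first_60_nt = "ATGGGGACCGCGCTTGTGTACCATGAGGACATGACGGCCACCCGGCTGCTCTGGGACGAC"
print(translate(first_60_nt))  # MGTALVYHEDMTATRLLWDD
```

The printed residues agree with the first 20 residues of the HDAC9v2 peptide listing. Codons that come out as "X" point to characters in a scanned listing that may need to be re-checked against the original.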

SEQ ID NO: 8 

>HDAC9v3 DNA sequence 

1 ATGGGGACCGCGCTTGTGTACCATGAGGACATGACGGCCACCCGGCTGCTCTGGGACGAC 
61 CCCGAGTGCGAGATCGAGCGTCCTGAGCGCCTGACCGCAGCCCTGGATCGCCTGCGGCAG 
121 CGCGGCCTGGAACAGAGGTGTCTGCGGTTGTCAGCCCGCGAGGCCTCGGAAGAGGAGCTG
181 GGCCTGGTGCACAGCCCAGAGTATGTATCCCTGGTCAGGGAGACCCAGGTCCTAGGCAAG 
241 GAGGAGCTGCAGGCGCTGTCCGGACAGTTCGACGCCATCTACTTCCACCCGAGTACCTTT
301 CACTGCGCGCGGCTGGCCGCAGGGGCTGGACTGC^^ 

361 GCTGTGCAAAATGGGCTTGCCCTGGTGAGGCCTCCCGGGCACCATGGCCAGAGGGCGGCT 
421 GCCAACGGGTTCTGCGTGTTCAACAACGTGGCCATAGCAGCTGCACATGCCAAGC^ 
481 CACGGGCTACACAGGATCCTCGTCGTGGACTGGGATGTGCACCATGGCCAGGGGATCCAG
54 1 TATCTCTTTGAGGATGACCCCAGCX3TCCTTTACTTC 
601 CX3CTTCTGGCCTTTCCTGCGAGAGTGAGATGCAGACGO 
661 GGCTTCACTGTCAACCTC^CCTGGAACC^ 

721 GGATTTGACTCAGCCATCGGGGACCCTGAGGGGCAAATGCAGGCCACGCCAGAGTGCTTC
781 GCCCACCTCACACAGCTGCTGCAGGTGCTGGCCGGCGGCCGGGTCTGTGCCGTGCTGGAG 
841 GGCGGCTACCACCTGGAGTCACTGGCGGAGTC^GTGTGC 

901 GGTGACCCGGCCCCACCCCTGTCAGGGCCAATGGCGCCATGTCAGAGGTGCGAGGGGAGT 
961 GCCCTAGAGTCCATCCAGAGTGCCCGTGCTCCCCAGGCCCCGCAC^^ 

1021 CAGCAAGATGTGACCGCTGTGCCGATGAGCCCCAGCAGCCACTCCCCAGAGGGGAGGCCT 

1081 CCACCTCTGCTGCCTGGGGGTCCAGTGTGTAAGGCAGCT 

1141 CTGGACCAGCCGTGCCTCTGCCCCGCACCCTCTGTCCGCACCGCTGTTGCCCTGACAACG
1201 CCGGATATCACATTGGTTCTGCCCCCTGACGTCATCCAACAGGAAGCGTCAGCCCTGAGG 
1261 GAGGAGACAGAAGCCTGGGCCAGGCCACACGAGTCCCTGGCCCGGGAGGAGGCCCTCACT 
1321 GCACTTGGGAAGCTCCTGTACCTCTTAGATGGGATGCTGGATGGGCAGGTGAACAGTGGT 
1381 ATAGCAGCCACTCCAGCCTCTGCTGCAGCAGCCACCCTGGATGTGGCTGTTCGGA 
1441 CTGTCCCACGGAGCCCAGAGCTGGGGTGTGGGAGAAGGGCTGCTGGAGGCAATGCCAGGT 
1501 GGGTCTCCAGCACAGAGGCTCAGCAGTCACAGCACCCCTGCCCATGGCCCCGTGAATG 
1561 CTTCCACCTCTGCCTCTGCGGTTTGGGCTCAGGA 
1621 ATCTTGGGClTGGTGCTGCCCCnXSGCCTATC 

1681 CTGGGGCCTGGC CATGGCTGCAGGGCCCCCACGCTGCACTCCTGGCTGCAATGCTTCGGG 
1741 GGCTGGCAGGGGGCCGAGTCCTGGCCCTCCTGGAGGAGAGGACGTCCAGGCCCTTATGTA 
1801 CCTGAGAGGGCAGCTGGAGCCTCAGTGGAAGATGTTGCAGTGCCATCCTCACCTGGTGGC 
1861 TTGAAATCGGCCAAG 

SEQ ID NO: 6 

>HDAC9v3 peptide sequence 

1 MGTAliVYHEDNrTATRIiLWDDPECTIERPERIiTAAIiDRIiRQRGL^ 
61 GLVHSPEYVSLVRETQVLGKEEI^ALSGQFDA 
121 AVQNGLALVRPPGHHGQRAAANGFCVFNNVAIAAAHAK^ 

181 YLFEDDPSVLYFSWHRYEHGRFWPFLRESDADAVGRGQGLGFTVNLPWNQFDPELVLVSA
241 GFDSAIGDPEGQMQATPECFAHLTQLLQVLAGGRVCAV^ 

301 GDPAPPLSGPMAPCQRCEGSALESIQSARAAQAPHWKSLQQQDVTAVPMSPSSHSPEGRP
361 PPLLPGGPVCKAAASAPSSLI^QPCLCPAPSVRTAVALTTPDITLVLPPDVIQQEAS 
421 EETEAWARPHESLARBEALTALGKIjLYLLDGMLDGQVNSG 

481 LSHGAQSWGVGEGLLEAMPGGSPAQRLSSHSTPAHGPVNALPPLPLRFGLRRMTGGFLSC
541 IIjGIAH^PIiAYGFQPDIiVIjVALGPGHGCRAP 

601 PERAAGASVEDVAVPSSPGGLKSAK
Histone deacetylase-related gene and protein
Field of the Invention 

This invention relates to a histone deacetylase gene and gene product. In particular, the
invention relates to a protein that is highly homologous to known yeast histone deacetylase 1
(hda1) class II histone deacetylases (HDACs), nucleic acid molecules that encode such a protein,
antibodies that recognize the protein, and methods for diagnosing conditions related to abnormal
HDAC activity, including, for example, abnormal cell proliferation, cancer, atherosclerosis,
inflammatory bowel disease, host inflammatory or immune response, or psoriasis.

Background of the Invention 

Histone acetylation is a major regulatory mechanism that modulates gene expression by
altering the accessibility of transcription factors to DNA. Acetylation of histones is a reversible
modification of the free ε-amino group of lysine that occurs during the assembly of nucleosomes
and during DNA synthesis. Changes in histone acetylation levels also occur during
transcriptional activation and silencing. Acetylation of histones is generally associated with
transcriptional activity, whereas deacetylation is associated with transcriptional repression.
Histone acetylation levels result from an equilibrium between competing histone acetylases and
deacetylases (Emiliani, S., Fischle, W., Van Lint, C., Al-Abed, Y., and Verdin, E., Proc. Natl.
Acad. Sci. U.S.A. 95, 2795-2800 (1998)).

HDACs have been shown to play an important role in the regulation of transcription.
HDACs function as components of complexes that are involved in transcriptional repression.
This is mediated through interactions of HDACs with multi-protein complexes and requires
deacetylase activity. HDAC complexes may contain the co-repressor mSin3A (Kasten, M.M.,
Dorland, S., Stillman, D.J. Mol Cell Biol 17, 4852-4858 (1997)) and mSin3A-associated
proteins (Zhang, Y., Iratni, R., Erdjument-Bromage, H., Tempst, P., Reinberg, D. Cell 89,
357-364 (1997); Zhang, Y., Sun, Z.W., Iratni, R., Erdjument-Bromage, H., Tempst, P., Hampsey, M.,
Reinberg, D. Mol Cell 1, 1021-1031 (1998)), silencing mediators NCoR (Nagy, L., H.-Y. Kao,
D. Chakravarti, R. J. Lin, C. A. Hassig, D. E. Ayer, S. L. Schreiber, and R. M. Evans (1997) Cell
89, 373-380) and SMRT (Alland, L. et al., Nature 387:49-55 (1997); Heinzel, T. et al. Nature
387:43-8 (1997)), transcriptional repressors Rb (Brownell, J. E., Zhou, J., Ranalli, T., Kobayashi,
R., Edmondson, D. G., Roth, S.Y., and Allis, C. D. (1996) Cell 84, 843-851), Rb-like proteins
p107 (Ferreira, R., Magnaghi-Jaulin, L., Robin, P., Harel-Bellan, A., Trouche, D. (1998) Proc.
Natl. Acad. Sci. USA 95, 10493-10498) and p130 (Stiegler, P., De Luca, A., Bagella, L.,
Giordano, A. (1998) Cancer Res. 389, 187-190), Rb-associated proteins (Nicolas, E., Morales,
V., Magnaghi-Jaulin, L., Harel-Bellan, A., Richard-Foy, H., Trouche, D. (2000) J. Biol. Chem.
275, 9797-9804, Lai, A., Lee, J.M., Yang, W.M., DeCaprio, J.A., Kaelin, W.G. Jr., Seto, E.,
Branton, P.E. (1999) Mol. Cell. Biol. 19, 6632-6641), Mad/Max (Laherty, C., W.-M. Yang,
J.-M. Sun, J. R. Davie, E. Seto, and R. N. Eisenman (1997) Cell 89, 349-456), nuclear hormone
receptors (Nagy, L., H.-Y. Kao, D. Chakravarti, R. J. Lin, C. A. Hassig, D. E. Ayer, S. L.
Schreiber, and R. M. Evans (1997) Cell 89, 373-380), nucleosome remodeling factors (Xue, Y.,
Wong, J., Moreno, G.T., Young, M.K., Cote, J., Wang, W. (1998) Mol. Cell. 2, 851-861),
methyl-binding proteins (Fuks, F., Burgers, W.A., Brehm, A., Hughes-Davies, L., Kouzarides, T.
(2000) Nat. Genet. 24, 88-91, Nan, X., Ng, H.H., Johnson, C.A., Laherty, C.D., Turner, B.M.,
Eisenman, R.N., Bird, A. (1998) Nature 393, 386-389, Ghosh, A.K., Steele, R., Ray, R.B. (1999)
Biochem. Biophys. Res. Commun. 260, 405-409, Ng, H. H., Zhang, Y., Hendrich, B., Johnson,
C.A., Turner, B.M., Erdjument-Bromage, H., Tempst, P., Reinberg, D., Bird, A. (1999) Nat.
Genet. 23, 58-61), and DNA repair machinery proteins (Yarden, R.I., Brody, L.C. (1999) Proc.
Natl. Acad. Sci. U.S.A. 96, 4983-4988, Cai, R.L., Yan-Neale, Y., Cueto, M.A., Xu, H., Cohen,
D. (2000) J. Biol. Chem. 275, 27909-27916). Furthermore, HDAC1 has been found to bind
directly to YY1 (Yang, W.-M., Inouye, C., Zeng, Y., Bearss, D., and Seto, E. (1996) Proc. Natl.
Acad. Sci. 93, 12845-12850) and Sp1 (Doetzlhofer, A., Rotheneder, H., Lagger, G., Koranda,
M., Kurtev, V., Brosch, G., Wintersberger, E., Seiser, C. (1999) Mol. Cell Biol. 19, 5504-5511),
and HDACs 4 and 5 bind to MEF2 (Grozinger, C. M., and Schreiber, S. L. (2000) Proc. Natl.
Acad. Sci. 97, 7835-7840). In addition, HDACs have been found together in complexes (Eilers,
A.L., Billin, A.N., Liu, J., Ayer, D.E. (1999) J Biol Chem 274, 32750-32756, Grozinger, C. M.,
and Schreiber, S. L. (2000) Proc. Natl. Acad. Sci. 97, 7835-7840).
Two distinct classes of yeast histone deacetylases have been identified based upon size
and sequence. Yeast class I HDACs include Rpd3, Hos1p, and Hos2p. Class II contains yeast
HDA1p. Furthermore, members of these two classes were found to form different complexes.
Human HDACs have been classified based upon their similarity to yeast sequences. Class I
human HDACs include HDACs 1-3 and 8. Class II HDACs include HDACs 4-7. The
deacetylase core of class I HDACs resides in the first ~390 amino acids. Class II HDAC catalytic
domains are located in the C-terminal region of these peptides, with the exception of HDAC6, which
contains a second catalytic domain in the N-terminus (Grozinger, C. M., Hassig, C. A., and
Schreiber, S. L. (1999) Proc. Natl. Acad. Sci. U.S.A. 96, 4868-4873).

An important approach that has been used to study the function of chromatin acetylation
is the use of specific inhibitors of histone deacetylase. Several classes of compounds have been
identified that inhibit HDAC. Histone deacetylase inhibitors have been found to have
anti-proliferative effects, including induction of G1/S and G2/M cell cycle arrest, differentiation
(Itazaki, H., K. Nagashima, K. Sugita, H. Yoshida, Y. Kawamura, Y. Yasuda, K. Matsumoto, K.
Ishii, N. Uotani, H. Nakai, A. Terui, S. Yoshimatsu, Y. Ikenishi and Y. Nakagawa (1990) J.
Antibiot. 12, 1524-1532, Hoshikawa, Y., Kijima, M., Yoshida, M., and Beppu, T. (1991) Agric.
Biol. Chem. 55, 1491-1497, Hoshikawa, Y., Kwon, H.-J., Yoshida, M., Horinouchi, S., and
Beppu, T. (1994) Exp. Cell Res. 214, 189-197, Sugita, K., Koizumi, K., and Yoshida, H. (1992)
Cancer Res. 52, 168-172, Yoshida, M., Y. Hoshikawa, K. Koseki, K. Mori and T. Beppu (1990)
J. Antibiot. 43, 1101-1106, Yoshida, M., Nomura, S., and Beppu, T. (1987) Cancer Res. 47,
3688-3691), and apoptosis (Medina, V., Edmonds, B., Young, G. P., James, R., Appleton, S.,
Zalewski, P. D. (1997) Cancer Res. 57, 3697-3707) of transformed and normal cells and reversal
of transformation (Kwon, H. J., Owa, T., Hassig, C. A., Shimada, J., and Schreiber, S. (1998)
Proc. Natl. Acad. Sci. U.S.A. 95, 3356-3361, Kim, M.-S., Son, M.-W., Park, Y. I., and Moon,
A. (2000) Cancer Lett. 157, 23-30). These effects, along with the presence of HDAC in
complexes with fusions of unliganded retinoic acid receptors PML-RARα and PLZF-RARα,
indicate a role for HDACs in tumorigenicity (Grignani, F., De Matteis, S., Nervi, C., Tomassoni,
L., Gelmetti, V., Cioce, M., Fanelli, M., Ruthardt, M., Ferrara, F. F., Zamir, I., Seiser, C.,
Grignani, F., Lazar, M. A., Minucci, S., Pelicci, P. G. (1998) Nature 391, 815-818, He, L. Z.,
Guidez, F., Tribioli, C., Peruzzi, D., Ruthardt, M., Zelent, A., Pandolfi, P. P. (1998) Nat. Genet.
18, 126-35, Lin, R.J., Nagy, L., Inoue, S., Shao, W., Miller, W. H. Jr. and Evans, R. M. (1998)
Nature 391, 811-814). Furthermore, the histone deacetylase inhibitors phenylbutyrate and
trichostatin A have shown promise in the treatment of promyelocytic leukemia, and several other
HDAC inhibitors are being studied and are nearing the clinic (Byrd, J.C., Shinn, C., Ravi, R.,
Willis, C.R., Waselenko, J.K., Flinn, I.W., Dawson, N.A., Grever, M.R. (1999) Blood 94,
1401-1408, Kim, Y.B., Lee, K.H., Sugita, K., Yoshida, M., Horinouchi, S. (1999) Oncogene 18,
2461-2470, Cohen, L.A., Amin, S., Marks, P.A., Rifkind, R.A., Desai, D., Richon, V.M. (1999)
Anticancer Res. 19, 4999-5005). In addition, the HDAC inhibitor butyrate was found to decrease
expression of the pro-inflammatory cytokines TNF-α, TNF-β, IL-6, and IL-1β. These effects are
thought to result from inhibition of NFκB activation (Segain, J.P., Raingeard de la Bletiere, D.,
Bourreille, A., Leray, V., Gervois, N., Rosales, C., Ferrier, L., Bonnet, C., Blottiere, H.M.,
Galmiche, J.P. (2000) Butyrate inhibits inflammatory responses through NFkappaB inhibition:
implications for Crohn's disease. Gut 47, 397-403) and its ability to inhibit histone deacetylases
(Inan, M.S., Rasoulpour, R.J., Yin, L., Hubbard, A.K., Rosenberg, D.W., Giardina, C. (2000)
The luminal short-chain fatty acid butyrate modulates NF-kappaB activity in a human colonic
epithelial cell line. Gastroenterology 118, 724-34).



The discovery of the HDAC inhibitor trapoxin made it possible to isolate the first human
histone deacetylase, HDAC1, using an affinity matrix column to which a trapoxin-like molecule
was bound (Taunton, J., Collins, J. L., and Schreiber, S. (1996) J. Am. Chem. Soc. 118, 10412-
10422). Subsequently, seven other human HDAC enzyme isoforms were reported (Taunton, J.,
Hassig, C. A., and Schreiber, S. L. (1996) Science 272, 408-411; Yang, W. M., Inouye, C., Zeng,
Y., Bearss, D., and Seto, E. (1996) Proc. Natl. Acad. Sci. U.S.A. 93, 12845-12850; Yang, W. M.,
Yao, Y. L., Sun, J. M., Davie, J. R., and Seto, E. (1997) J. Biol. Chem. 272, 28001-28007;
Emiliani, S., Fischle, W., Van Lint, C., Al-Abed, Y., and Verdin, E. (1998) Proc. Natl. Acad.
Sci. U.S.A. 95, 2795-2800). These 8 HDACs have been divided into class I (HDACs 1-3 and 8,
similar to the yeast gene Rpd3) and class II (HDACs 4-7, similar to the yeast gene hda1)
(Grozinger, C. M., Hassig, C. A., and Schreiber, S. L. (1999) Proc. Natl. Acad. Sci. U.S.A. 96,
4983-4988) based on sequence homology. Here we report the isolation and characterization of a
potential new HDAC, referred to herein as HDAC9, which displays sequence similarity to the
hda1 class II HDACs. HDAC9 has characteristics that bridge HDAC class I and class II.


Summary of the Invention 

The present invention relates to histone deacetylases, in particular to a novel histone 
deacetylase HDAC9. 

In a first aspect, the invention provides an isolated polypeptide comprising an amino acid
sequence as set forth in SEQ ID NO:1, SEQ ID NO:5 or SEQ ID NO:6. Furthermore, the
invention provides an isolated polypeptide consisting of an amino acid sequence as set forth in
SEQ ID NO:1, SEQ ID NO:5 or SEQ ID NO:6. The amino acid sequence as set forth in SEQ ID
NO:1, SEQ ID NO:5 or SEQ ID NO:6 shows a considerable degree of homology to that of
known members of the family of HDACs. For convenience, the polypeptide consisting of the
amino acid sequence as set forth in SEQ ID NO:1, SEQ ID NO:5 or SEQ ID NO:6 will be
designated as histone deacetylase 9 or HDAC9. Such a polypeptide, or a fragment thereof, is
expressed in various normal tissues; for example, HDAC9 was present in normal testes, stomach,
spleen, small intestine, placenta, liver, kidney, colon, lung, heart, and brain, as an approximately
3 kb transcript. HDAC9 was not detected in muscle, but this lane also did not hybridize with GAPDH
(Figure 7). Fragments of the isolated polypeptide having an amino acid sequence as set forth in
SEQ ID NO:1, SEQ ID NO:5 or SEQ ID NO:6 will comprise polypeptides comprising from
about 5 to about 148 amino acids, preferably from about 10 to about 143 amino acids, more preferably
from about 20 to about 100 amino acids, and most preferably from about 20 to about 50 amino
acids. Such fragments also form a part of the present invention. Preferably, fragments will
encompass the catalytic domain, which is predicted to exist between amino acid numbers 1 and
390. In accordance with this aspect of the invention there are provided novel polypeptides of
human origin as well as biologically, diagnostically or therapeutically useful fragments, variants
and derivatives thereof, variants and derivatives of the fragments, and analogs of the foregoing.

In a second aspect, the invention provides an isolated DNA comprising a nucleotide
sequence that encodes a polypeptide as mentioned above. In particular, the invention provides
(1) an isolated DNA comprising the nucleotide sequence as set forth in SEQ ID NO:2, SEQ ID
NO:7 or SEQ ID NO:8; (2) an isolated DNA comprising the nucleotide sequence set forth in SEQ
ID NO:3; (3) an isolated DNA capable of hybridizing under high stringency conditions to the
nucleotide sequence set forth in SEQ ID NO:3; and (4) an isolated DNA comprising the
nucleotide sequence set forth in SEQ ID NO:4. Also provided are nucleic acid sequences
comprising at least about 15 bases, preferably at least about 20 bases, more preferably a nucleic
acid sequence comprising about 30 contiguous bases of SEQ ID NO:2, SEQ ID NO:7, SEQ
ID NO:8 or SEQ ID NO:3. Also within the scope of the present invention are nucleic acids that
are substantially similar to the nucleic acid with the nucleotide sequence as set forth in SEQ ID
NO:2, SEQ ID NO:7, SEQ ID NO:8 or SEQ ID NO:3. In a preferred embodiment, the
isolated DNA takes the form of a vector molecule comprising at least a fragment of a DNA of
the present invention, in particular comprising the DNA consisting of a nucleotide sequence as
set forth in SEQ ID NO:2, SEQ ID NO:7, SEQ ID NO:8 or SEQ ID NO:3.

A third aspect of the present invention encompasses a method for the diagnosis of
conditions associated with abnormal regulation of gene expression which includes, but is not
limited to, conditions associated with abnormal cell proliferation, cancer, atherosclerosis,
inflammatory bowel disease, or psoriasis in a human which comprises detecting abnormal
transcription of messenger RNA transcribed from the natural endogenous human gene encoding
the novel polypeptide consisting of the amino acid sequence set forth in SEQ ID NO:1, SEQ ID
NO:5 or SEQ ID NO:6 in an appropriate tissue or cell from a human, wherein such abnormal
transcription is diagnostic of the human's affliction with such a condition. In particular, the said
natural endogenous human gene encoding the novel polypeptide consisting of the amino acid
sequence set forth in SEQ ID NO:1, SEQ ID NO:5 or SEQ ID NO:6 comprises the genomic
nucleotide sequence set forth in SEQ ID NO:4. In one embodiment of the present invention, the
diagnostic method comprises contacting a sample of said appropriate tissue or cell or contacting
an isolated RNA or DNA molecule derived from that tissue or cell with an isolated nucleotide
sequence of at least about 15-20 nucleotides in length that hybridizes under high stringency
conditions with the isolated nucleotide sequence encoding the novel polypeptide having an
amino acid sequence set forth in SEQ ID NOs:1, 5 or 6.

Another embodiment of the assay aspect of the invention provides a method for the
diagnosis of a condition associated with abnormal HDAC9 activity in a human, which comprises
measuring the level of deacetylase activity in a certain tissue or cell from a human suffering from
such a condition, wherein the presence of an abnormal level of deacetylase activity, relative to
the level thereof in the respective tissue or cell of a human not suffering from a condition
associated with abnormal HDAC activity, is diagnostic of the human's suffering from said
condition.

In accordance with one embodiment of this aspect of the invention there are provided 
anti-sense polynucleotides that can regulate transcription of the gene encoding the novel 
HDAC9; in another embodiment, double stranded RNA is provided that can regulate the 
transcription of the gene encoding the novel HDAC9. 

Another aspect of the invention provides a process for producing the aforementioned
polypeptides, polypeptide fragments, variants and derivatives, fragments of the variants and
derivatives, and analogs of the foregoing. In a preferred embodiment of this aspect of the
invention there are provided methods for producing the aforementioned HDAC9 comprising
culturing host cells having incorporated therein an expression vector containing an
exogenously-derived nucleotide sequence encoding such a polypeptide under conditions sufficient for
expression of the polypeptide in the host cell, thereby causing expression of the polypeptide, and
optionally recovering the expressed polypeptide. In a preferred embodiment of this aspect of the
present invention, there is provided a method for producing polypeptides comprising or
consisting of an amino acid sequence as set forth in SEQ ID NOs:1, 5 or 6 which comprises
culturing a host cell having incorporated therein an expression vector containing an
exogenously-derived polynucleotide encoding a polypeptide comprising or consisting of an amino acid
sequence as set forth in SEQ ID NOs:1, 5 or 6 under conditions sufficient for expression of such
a polypeptide in the host cell, thereby causing the production of an expressed polypeptide, and
optionally recovering the expressed polypeptide. Preferably, in any of such methods the
exogenously derived polynucleotide comprises or consists of the nucleotide sequence set forth in
SEQ ID NOs:2, 7 or 8, the nucleotide sequence set forth in SEQ ID NO:3, or the nucleotide
sequence set forth in SEQ ID NO:4. In accordance with another aspect of the invention there are
provided products, compositions, processes and methods that utilize the aforementioned
polypeptides and polynucleotides for, inter alia, research, biological, clinical and therapeutic
purposes.


In certain additional preferred embodiments of this aspect of the invention there is
provided an antibody or a fragment thereof which specifically binds to a polypeptide that
comprises the amino acid sequence set forth in SEQ ID NOs:1, 5 or 6, i.e., all HDAC9 variants.
In certain particularly preferred embodiments in this regard, the antibodies are highly selective
for human HDAC9 polypeptides or portions of human HDAC9 polypeptides.

In a further aspect, an antibody or fragment thereof is provided that binds to a fragment
or portion of the amino acid sequence set forth in SEQ ID NOs:1, 5 or 6.

In another aspect, methods of treating a condition in a subject, wherein the condition is
associated with abnormal HDAC9 gene expression, an increase or decrease in the presence of
HDAC9 polypeptide in a subject, or an increase or decrease in the activity of HDAC9
polypeptide, by the administration of an effective amount of an antibody that binds to a
polypeptide with the amino acid sequence set out in SEQ ID NOs:1, 5 or 6, or a fragment or
portion thereof, to the subject are provided. Also provided are methods for the diagnosis of a
disease or condition associated with abnormal HDAC9 gene expression or an increase or
decrease in the presence of the HDAC9 in a subject, or an increase or decrease in the activity of
HDAC9 polypeptide, which comprises utilizing conventional methodologies, including, for
example, the H4 histone assay that was previously described (Inokoshi, J., Katagiri, M., Arima,
S., Tanaka, H., Hayashi, M., Kim, Y.-B., Furumai, R., Yoshida, M., Horinouchi, S., Omura, S.
(1999) Biochem. Biophys. Res. Com. 256, 372-376).

In yet another aspect, the invention provides host cells which can be propagated in vitro,
preferably vertebrate cells, in particular mammalian cells, or bacterial cells, which are capable
upon growth in culture of producing a polypeptide that comprises the amino acid sequence set
forth in SEQ ID NOs:1, 5 or 6 or fragments thereof, where the cells contain transcriptional
control DNA sequences, where the transcriptional control sequences control transcription of
RNA encoding a polypeptide with the amino acid sequence according to SEQ ID NOs:1, 5 or 6,
or fragments thereof. This includes, but is not limited to, the propagation of HDAC9 in a
plasmid and the production of DNA, RNA or protein in human or insect cells or bacteria using
the endogenous HDAC9 promoter or any other transcriptional control sequence.

In yet another aspect of the present invention there are provided assay methods and kits
comprising the components necessary to detect above-normal expression of polynucleotides
encoding a polypeptide comprising an amino acid sequence as set forth in SEQ ID NOs:1, 5 or 6,
or polypeptides comprising an amino acid sequence set forth in SEQ ID NOs:1, 5 or 6, or
fragments thereof, in body tissue samples derived from a patient, such kits comprising, e.g.,
antibodies that bind to a polypeptide comprising an amino acid sequence set forth in SEQ ID
NOs:1, 5 or 6 or to fragments thereof, or oligonucleotide probes that hybridize with
polynucleotides of the invention. In a preferred embodiment, such kits also comprise
instructions detailing the procedures by which the kit components are to be used.

In another aspect, the invention is directed to use of a polypeptide comprising an amino
acid sequence set forth in SEQ ID NOs:1, 5 or 6, or fragment thereof, polynucleotide encoding
such a polypeptide or a fragment thereof, or antibody that binds to said polypeptide comprising
an amino acid sequence set forth in SEQ ID NOs:1, 5 or 6, or a fragment thereof, in the
manufacture of a medicament to treat diseases associated with abnormal HDAC activity or gene
expression.

Another aspect is directed to pharmaceutical compositions comprising a polypeptide
comprising or consisting of an amino acid sequence set forth in SEQ ID NOs:1, 5 or 6, or
fragment thereof, a polynucleotide encoding such a polypeptide or a fragment thereof, or
antibody that binds to such a polypeptide or a fragment thereof, in conjunction with a suitable
pharmaceutical carrier, excipient or diluent, for the treatment of diseases associated with
abnormal HDAC activity or gene expression.

In another aspect, the invention is directed to methods for the identification of molecules
that can bind to a polypeptide comprising an amino acid sequence set forth in SEQ ID NOs:1, 5
or 6 and/or modulate the activity of a polypeptide comprising an amino acid sequence set forth
in SEQ ID NOs:1, 5 or 6, or molecules that can bind to nucleic acid sequences that modulate the
transcription or translation of a polynucleotide encoding a polypeptide comprising an amino acid
sequence set forth in SEQ ID NOs:1, 5 or 6. Such methods are disclosed in, e.g., U.S. Patent
Nos. 5,541,070; 5,567,317; 5,593,853; 5,670,326; 5,679,582; 5,856,083; 5,858,657; 5,866,341;
5,876,946; 5,989,814; 6,010,861; 6,020,141; 6,030,779; and 6,043,024, all of which are
incorporated by reference herein in their entirety. Molecules identified by such methods also fall
within the scope of the present invention.

In a related aspect, the invention is directed to use of the novel HDAC9 to identify
associated proteins in HDAC biologically relevant complexes. At present, the proteins that
associate with HDAC9 are not known. However, these may be characterized by determining
whether HDAC9 associates with proteins that have been previously shown to interact with other
HDACs (see Introduction). For example, components of HDAC9 complexes may be determined
using conventional methods, including co-immunoprecipitation (see Example 9).

In yet another aspect, the invention is directed to methods for the introduction of nucleic
acids of the invention into one or more tissues of a subject in need of treatment with the result
that one or more proteins encoded by the nucleic acids are expressed and/or secreted by cells
within the tissue.

Other objects, features, advantages and aspects of the present invention will become
apparent to those of skill from the following description. It should be understood, however, that
the following description and the specific examples, while indicating preferred embodiments of
the invention, are given by way of illustration only. Various changes and modifications within
the spirit and scope of the disclosed invention will become readily apparent to those skilled in the
art from reading the following description and from reading the other parts of the present
disclosure.

Brief Description of the Drawings

Figure 1 shows the 1156 bp open reading frame that was identified using GENFAM
(proprietary software) and used to search databases for the complete HDAC9 cDNA sequence.
The respective ORF (SEQ ID NO:3) starts at nucleotide position no. 1 and ends at nucleotide
position no. 1156.

Figures 2A and 2B show the full length cDNA sequence (SEQ ID NO:2) of HDAC9 and
the amino acid sequence (SEQ ID NO:1), respectively. The full length cDNA sequence starts at
nucleotide position no. 1 and ends at nucleotide position 2022.

Figure 3 shows the genomic DNA sequence in silico (AL022328) (SEQ ID NO:4),
aligned with the sequence of clone 198929/HDAC9. The alignment was produced using
proprietary software (Novartis Pharmaceuticals, Summit, NJ).

Figure 4 is a depiction of the alignment of HDAC9 predicted peptide and S. pombe Hda1
peptide. The query is HDAC9 peptide and the subject is S. pombe Hda1 peptide. The alignment
was produced using the Clustalw algorithm (Higgins, D.G., Thompson, J.D., Gibson, T.J. (1996)
Using CLUSTAL for multiple sequence alignments. Methods Enzymol. 266, 383-402).

Figure 5 shows the alignment of HDAC1 and HDAC9v1 and locations of the putative
catalytic domain amino acids and Rb-binding domain. Catalytic domain amino acids are boxed
and putative Rb domain amino acids are contained within crosshatched boxes. The alignment
was produced using the Clustalw algorithm (Higgins, D.G., Thompson, J.D., Gibson, T.J. (1996)
Using CLUSTAL for multiple sequence alignments. Methods Enzymol. 266, 383-402).

Figure 6 shows the alignment of HDACs 1-9v1. The alignment was produced using the
Clustalw algorithm (Higgins, D.G., Thompson, J.D., Gibson, T.J. (1996) Using CLUSTAL for
multiple sequence alignments. Methods Enzymol. 266, 383-402).

Figure 7 shows the Northern analysis of HDAC9. (A) Northern blot analysis of the
distribution of HDAC9 in normal human tissues. GAPDH was hybridized to the same blot as a
control for RNA loading. (B) Northern blot analysis of HDAC9 in matched tumor and normal
tissues. GAPDH was hybridized to the same blot as a control for RNA loading.

Figure 8 shows Real Time PCR analysis of the distribution of HDAC9 in normal human
tissues and cell lines relative to 18S ribosomal RNA. RNA from the human lung carcinoma cell
line A549 was used as an internal control.

Figure 9 shows the alignment of HDAC9v1 with class II HDACs (HDACs 4, 5, 6, 7). The
alignment was produced using the Clustalw algorithm (Higgins, D.G., Thompson, J.D., Gibson,
T.J. (1996) Using CLUSTAL for multiple sequence alignments. Methods Enzymol. 266, 383-
402). Catalytic domain amino acids are boxed.

Figure 10 shows the alignment of HDAC9v1 with class I HDACs (HDACs 1, 2, 3, 8). The
alignment was produced using the Clustalw algorithm (Higgins, D.G., Thompson, J.D., Gibson,
T.J. (1996) Using CLUSTAL for multiple sequence alignments. Methods Enzymol. 266, 383-
402). Catalytic domain amino acids are boxed.

Figure 11 shows the three HDAC9 sequence variants (HDAC9v1, HDAC9v2, and
HDAC9v3). HDAC9v1 and HDAC9v2 were found by searching the human EST database and
HDAC9v3 was found as a predicted transcript in the Celera Sequence database. (A) shows an
alignment of the 3 HDAC9 variant peptide sequences. (B) shows a schematic of class I and class
II HDAC peptide sequences. Catalytic domains are in filled boxes and putative LXCXE motifs
are in open boxes. (C) is a schematic of the genomic structures of HDAC9v1 and HDAC9v2.
Exons are shown as filled boxes and introns are shown as lines between the filled boxes. Lengths
of boxes and lines represent the lengths of exons and introns.

Figure 12 shows that HDAC9 is an enzymatically active histone deacetylase. (A)
HDAC9 catalytic activity is comparable to the activity of HDAC3 and HDAC4. (B) shows that
HDAC1 was more efficient than HDAC3, HDAC4, and HDAC9 at deacetylating the histone
substrate in this assay.



Figure 13 shows that HDAC9 is a nuclear protein and shows that HDAC9-flag is in vitro
translated.



Figure 14 shows DNA and peptide sequences for HDAC9v3 and HDAC9v2. 



Detailed Description of the Invention 


All patent applications, patents and literature references cited herein are hereby 
incorporated by reference in their entirety. 

In practicing the present invention, many conventional techniques in molecular biology,
microbiology, and recombinant DNA are used. These techniques are well known and are
explained in, for example, Current Protocols in Molecular Biology, Volumes I, II, and III, 1997
(F. M. Ausubel ed.); Sambrook et al., 1989, Molecular Cloning: A Laboratory Manual, Second
Edition, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y.; DNA Cloning: A
Practical Approach, Volumes I and II, 1985 (D. N. Glover ed.); Oligonucleotide Synthesis, 1984
(M. L. Gait ed.); Nucleic Acid Hybridization, 1985, (Hames and Higgins); Transcription and
Translation, 1984 (Hames and Higgins eds.); Animal Cell Culture, 1986 (R. I. Freshney ed.);
Immobilized Cells and Enzymes, 1986 (IRL Press); Perbal, 1984, A Practical Guide to
Molecular Cloning; the series, Methods in Enzymology (Academic Press, Inc.); Gene Transfer
Vectors for Mammalian Cells, 1987 (J. H. Miller and M. P. Calos eds., Cold Spring Harbor
Laboratory); and Methods in Enzymology Vol. 154 and Vol. 155 (Wu and Grossman, and Wu,
eds., respectively).

The following abbreviations used throughout the disclosure are listed herein below:
histone deacetylase (HDAC), histone deacetylase-like protein (HDLP).

In its broadest sense, the term "substantially similar", when used herein with respect to a 
nucleotide sequence, means a nucleotide sequence corresponding to a reference nucleotide 
sequence, wherein the corresponding sequence encodes a polypeptide having substantially the 
same structure and function as the polypeptide encoded by the reference nucleotide sequence, 
e.g. where only changes in amino acids not affecting the polypeptide function occur. Desirably 
the substantially similar nucleotide sequence encodes the polypeptide encoded by the reference 
nucleotide sequence. The percentage of identity between the substantially similar nucleotide 
sequence and the reference nucleotide sequence desirably is at least 80%, more desirably at least 
85%, preferably at least 90%, more preferably at least 95%, still more preferably at least 99%. 
Sequence comparisons are carried out using Clustalw (see, for example, Higgins, D.G. et al. 
Methods Enzymol. 266:383-402 (1996)). Clustalw alignments were performed using default 
parameters. 
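
As a rough illustration of the identity thresholds listed above, the following Python sketch computes percent identity for two sequences that have already been aligned (for example, by Clustalw) and maps the result onto the stated preference tiers. The function names, the "-" gap symbol, and the choice of denominator (aligned columns in which neither sequence has a gap) are assumptions made for this example; the text does not specify how identity is derived from the alignment output.

```python
from typing import Optional

# Thresholds taken from the paragraph above, highest first.
IDENTITY_TIERS = [
    (99.0, "still more preferably"),
    (95.0, "more preferably"),
    (90.0, "preferably"),
    (85.0, "more desirably"),
    (80.0, "desirably"),
]


def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence is gapped.

    Assumption: both inputs are rows of the same alignment, so they must
    have equal length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # skip columns containing a gap
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def identity_tier(identity_pct: float) -> Optional[str]:
    """Return the preference label for the highest threshold met, or None."""
    for threshold, label in IDENTITY_TIERS:
        if identity_pct >= threshold:
            return label
    return None


if __name__ == "__main__":
    reference = "ATGGCCAAGT"  # hypothetical toy sequences, not from the disclosure
    candidate = "ATGGCCAAGA"
    pid = percent_identity(reference, candidate)
    print(pid, identity_tier(pid))
```

Other identity definitions (for example, dividing by the full alignment length or by the shorter sequence) would give different percentages for gapped alignments, so the denominator should be fixed before comparing a result against the thresholds.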

A nucleotide sequence "substantially similar" to a reference nucleotide sequence
hybridizes to the reference nucleotide sequence in 7% sodium dodecyl sulfate (SDS), 0.5 M
NaPO4, 1 mM EDTA at 50°C with washing in 2X SSC, 0.1% SDS at 50°C, more desirably in
7% sodium dodecyl sulfate (SDS), 0.5 M NaPO4, 1 mM EDTA at 50°C with washing in 1X
SSC, 0.1% SDS at 50°C, more desirably still in 7% sodium dodecyl sulfate (SDS), 0.5 M
NaPO4, 1 mM EDTA at 50°C with washing in 0.5X SSC, 0.1% SDS at 50°C, preferably in 7%
sodium dodecyl sulfate (SDS), 0.5 M NaPO4, 1 mM EDTA at 50°C with washing in 0.1X SSC,
0.1% SDS at 50°C, more preferably in 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO4, 1 mM
EDTA at 50°C with washing in 0.1X SSC, 0.1% SDS at 65°C, yet still encodes a functionally
equivalent gene product.
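
The five hybridization and wash conditions above share the same hybridization buffer and differ only in the wash step. The sketch below tabulates them in Python and checks that they are listed in order of increasing stringency, using the general rule that lower salt (SSC) concentration and higher temperature make a wash more stringent. The class name, field names, and the "baseline" label for the first tier are illustrative assumptions, not terms from the disclosure.

```python
from dataclasses import dataclass

# Hybridization buffer common to all five tiers, as stated in the text.
HYBRIDIZATION = "7% SDS, 0.5 M NaPO4, 1 mM EDTA at 50 C"


@dataclass(frozen=True)
class Wash:
    preference: str  # wording used in the text ("baseline" is our own label)
    ssc_x: float     # SSC concentration, in multiples of 1X
    sds_pct: float   # SDS concentration in percent
    temp_c: int      # wash temperature in degrees Celsius


WASH_TIERS = [
    Wash("baseline", 2.0, 0.1, 50),
    Wash("more desirably", 1.0, 0.1, 50),
    Wash("more desirably still", 0.5, 0.1, 50),
    Wash("preferably", 0.1, 0.1, 50),
    Wash("more preferably", 0.1, 0.1, 65),
]


def stringency_key(wash: Wash) -> tuple:
    """Sort key: lower salt first ranks higher, then higher temperature."""
    return (-wash.ssc_x, wash.temp_c)


def is_increasing_stringency(tiers: list) -> bool:
    """True if each tier is strictly more stringent than the one before it."""
    keys = [stringency_key(w) for w in tiers]
    return all(a < b for a, b in zip(keys, keys[1:]))


if __name__ == "__main__":
    for wash in WASH_TIERS:
        print(f"{wash.preference}: {wash.ssc_x}X SSC, {wash.sds_pct}% SDS, {wash.temp_c} C")
    print("increasing stringency:", is_increasing_stringency(WASH_TIERS))
```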

"Elevated transcription of mRNA" refers to a greater amount of messenger RNA 
transcribed from the natural endogenous human gene encoding the novel polypeptide of the
present invention present in an appropriate tissue or cell of an individual suffering from a
condition associated with abnormal HDAC9 activity than in a subject not suffering from such a
disease or condition; in particular at least about twice, preferably at least about five times, more
preferably at least about ten times, most preferably at least about 100 times the amount of mRNA
found in corresponding tissues in humans who do not suffer from such a condition. Such
elevated level of mRNA may eventually lead to increased levels of protein translated from such
mRNA in an individual suffering from a condition associated with abnormal cellular
proliferation as compared with a healthy individual. It is also understood that "elevated
transcription of mRNA" may refer to a greater amount of messenger RNA transcribed from
genes the expression of which is modulated by HDAC9 either alone or in combination with other
molecules.
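
The fold thresholds in the definition of elevated transcription can be expressed as a small calculation. The Python sketch below divides a patient mRNA measurement by a reference measurement from unaffected tissue and reports the highest tier met. The function names are hypothetical, the measurements are assumed to be already normalized to a loading control, and treating "about twice" and the other thresholds as exact cut-offs is a simplification made for this example.

```python
from typing import Optional

# Fold thresholds from the definition above, highest first.
FOLD_TIERS = [
    (100.0, "most preferably"),
    (10.0, "more preferably"),
    (5.0, "preferably"),
    (2.0, "in particular"),
]


def fold_change(patient_level: float, reference_level: float) -> float:
    """Ratio of patient mRNA level to the level in unaffected tissue."""
    if reference_level <= 0:
        raise ValueError("reference level must be positive")
    return patient_level / reference_level


def elevation_tier(fold: float) -> Optional[str]:
    """Label of the highest threshold met, or None if below two-fold."""
    for threshold, label in FOLD_TIERS:
        if fold >= threshold:
            return label
    return None


if __name__ == "__main__":
    # Hypothetical normalized signal values, for illustration only.
    fold = fold_change(patient_level=250.0, reference_level=100.0)
    print(fold, elevation_tier(fold))
```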

A "host cell," as used herein, refers to a prokaryotic or eukaryotic cell that contains
heterologous DNA that has been introduced into the cell by any means, e.g., electroporation,
calcium phosphate precipitation, microinjection, transformation, viral infection, and the like.

"Heterologous" as used herein means "of different natural origin" or represents a non-natural
state. For example, if a host cell is transformed with a DNA or gene derived from
another organism, particularly from another species, that gene is heterologous with respect to 
that host cell and also with respect to descendants of the host cell which carry that gene. 
Similarly, heterologous refers to a nucleotide sequence derived from and inserted into the same 
natural, original cell type, but which is present in a non-natural state, e.g. a different copy 
number, or under the control of different regulatory elements. 

A "vector" molecule is a nucleic acid molecule into which heterologous nucleic acid may 
be inserted which can then be introduced into an appropriate host cell. Vectors preferably have 
one or more origins of replication, and one or more sites into which the recombinant DNA can be
inserted. Vectors often have convenient means by which cells with vectors can be selected from 
those without, e.g., they encode drug resistance genes. Common vectors include plasmids, viral 
genomes, and (primarily in yeast and bacteria) "artificial chromosomes." 

"Plasmids" generally are designated herein by a lower case p preceded and/or followed 
by capital letters and/or numbers, in accordance with standard naming conventions that are
familiar to those of skill in the art. Starting plasmids disclosed herein are either commercially
available, publicly available on an unrestricted basis, or can be constructed from available
plasmids by routine application of well known, published procedures. Many plasmids and other
cloning and expression vectors that can be used in accordance with the present invention are well
known and readily available to those of skill in the art. Moreover, those of skill readily may
construct any number of other plasmids suitable for use in the invention. The properties,
construction and use of such plasmids, as well as other vectors, in the present invention will be
readily apparent to those of skill from the present disclosure.

The term "isolated" means that the material is removed from its original environment
(e.g., the natural environment if it is naturally occurring). For example, a naturally-occurring
polynucleotide or polypeptide present in a living animal is not isolated, but the same
polynucleotide or polypeptide, separated from some or all of the coexisting materials in the
natural system, is isolated, even if subsequently reintroduced into the natural system. Such
polynucleotides could be part of a vector and/or such polynucleotides or polypeptides could be
part of a composition, and still be isolated in that such vector or composition is not part of its
natural environment.

As used herein, the term "transcriptional control sequence" refers to DNA sequences,
such as initiator sequences, enhancer sequences, and promoter sequences, which induce, repress,
or otherwise control the transcription of protein encoding nucleic acid sequences to which they
are operably linked.

As used herein, "human transcriptional control sequences" are any of those
transcriptional control sequences normally found associated with the human gene encoding the
novel HDAC9 polypeptide of the present invention as it is found in the respective human
chromosome. It is understood that the term may also refer to transcriptional control sequences
normally found associated with human genes the expression of which is modulated by HDAC9
either alone or in combination with other molecules.

As used herein, a "non-human transcriptional control sequence" is any transcriptional
control sequence not found in the human genome.

The term "polypeptide" is used interchangeably herein with the terms "polypeptides" and
"protein(s)".

As used herein, a "chemical derivative" of a polypeptide of the invention is a polypeptide
of the invention that contains additional chemical moieties not normally a part of the molecule.
Such moieties may improve the molecule's solubility, absorption, biological half life, etc. The
moieties may alternatively decrease the toxicity of the molecule, eliminate or attenuate any
undesirable side effect of the molecule, etc. Moieties capable of mediating such effects are
disclosed, for example, in Remington's Pharmaceutical Sciences, 16th ed., Mack Publishing Co.,
Easton, Pa. (1980).

As used herein, "HDAC9" refers to the amino acid sequences of substantially purified 
HDAC9 obtained from any species, particularly mammalian, including bovine, ovine, porcine, 
murine, equine, and preferably human, from any source, whether natural, synthetic, semi- 
synthetic, or recombinant. 

As used herein, "HDAC activity", including "HDAC9 activity", refers to the ability of an
HDAC polypeptide to deacetylate histone proteins, including 3H-labeled H4 histone peptide.
Such activity may be measured according to conventional methods, for example as described in 
Inokoshi, J., Katagiri, M., Arima, S., Tanaka, H., Hayashi, M., Kim, Y.-B., Furumai, R., 
Yoshida, M., Horinouchi, S., and Omura, S. (1999) Biochem. Biophys. Res. Com. 256, 372-376. 
A biologically "active" protein refers to a protein having structural, regulatory, or biochemical 
functions of a naturally occurring molecule. 

The term "agonist", as used herein, refers to a molecule which, when bound to HDAC9,
causes a change in HDAC9 which modulates the activity of HDAC9. Agonists may include
proteins, nucleic acids, carbohydrates, or any other molecules that bind to HDAC9.

The terms "antagonist" or "inhibitor" as used herein, refer to a molecule which when 
bound to HDAC9, blocks or modulates the biological activity of HDAC9. Antagonists and 
inhibitors may include proteins, nucleic acids, carbohydrates, or any other molecules, natural or 
synthetic that bind to HDAC9. 

HDAC9 was identified using proprietary computer software called GENFAM to search
for new human sequences that are related to histone deacetylases in the Celera Human Genome
Database, Incyte LIFESEQ® database and the public High Throughput Genomic database. A
1156 bp open reading frame (ORF) was identified and used to search a database of sequenced
clones from pan-tissue and dorsal root ganglion cDNA libraries. Four clones were found to
contain the ORF (M6, K10, P3, F23), two from each library. Of these clones, M6, from the
pan-tissue library, was determined to be the most complete cDNA as a result of sequence analysis and
in vitro translation. BLAST (Altschul, S.F. et al. Nucleic Acids Res. 25:3389-402 (1997)) was used
to search the Genbank database using cDNA clone M6. Genomic sequence AL022328 was found
to contain exons that were identical in sequence to the M6 cDNA. A Clustalw alignment of the
antisense sequence of HDAC9 (2022 to 8) with genomic sequence AL022328 is shown in Figure
3. The first 7 bases of the HDAC9 predicted cDNA are not aligned, presumably because they
occur following the next intron and this sequence was probably too short for the software to
determine an alignment. The sequence of cDNA clone M6 was confirmed by automated DNA
sequencing (ACGT, Inc., Northbrook, IL). Based upon the predicted cDNA sequence from
genomic sequence AL022328, 44 bases were missing from the N-terminus of M6. This sequence
was subsequently added by PCR.

The full length cDNA for HDAC9 predicts a protein of 673 amino acids. The HDAC9 
cDNA sequence is 2022 base pairs in length. In order to determine the percent similarity of 
HDAC9 to other known HDACs, a Clustalw multiple sequence alignment was performed using 
complete peptide sequences for HDACs 1-9. HDAC9 is most similar in peptide sequence to 
human HDAC6 at 37%. The Clustalw alignment of HDAC9 with class II HDACs is shown in 
Figure 9. HDAC9 was also 40% similar to a yeast class II sequence, hda1, from S. pombe. The
Clustalw alignment of human HDAC9 and S. pombe hda1 is shown in Figure 4. HDAC9 was less
similar to class I HDACs (<18%). The Clustalw alignment of HDAC9 to class I HDACs is 
shown in Figure 10. HDAC9 possesses a putative catalytic domain which encompasses 
approximately 317 aa (~6 to 323) based upon alignments of HDAC9 with the putative catalytic
domains of all of the other known HDACs. To identify the catalytic domain of HDAC9, 
Clustalw alignments were performed separately using HDAC9 complete peptide and catalytic 
domain sequences from class I HDACs (1-3 and 8) or class II HDACs (4-7). Thirteen amino acids
were previously shown to confer deacetylase activity, based upon inactivation by single amino
acid mutations and the three dimensional structure formed by a complex of HDAC-like protein
(HDLP), Zn2+ and HDAC inhibitors (Finnin, M. S., Donigian, J. R., Cohen, A., Richon, V. M.,
Rifkind, R. A., Marks, P. A., Breslow, R., and Pavletich, N. P. (1999) Structures of a histone
deacetylase homologue bound to TSA and SAHA inhibitors. Nature 401, 188-193). These 13
amino acids include Pro 22, His 131, His 132, Gly 140, Phe 141, Asp 166, Asp 168, His 170,
Asp 173, Phe 198, Asp 258, Leu 265, and Tyr 297. 12 out of 13 of these amino acids are
conserved in HDAC9. The amino acid that is not conserved is Leu 265. This hydrophobic
residue forms part of the TSA binding pocket and is replaced in HDAC9 with Glu at amino acid
272. Leu 265 is replaced with Met in HDAC8 and Lys in HDAC6 domain 1. This suggests that
this residue is not highly conserved and need not be identical to other HDACs. The second
residue that differs from HDLP, HDAC1, and HDAC2, Asp 173, is substituted with Gln at
position 177 in HDAC9, a difference that is also present in the HDAC6 catalytic domain 1.
Furthermore, Asp 173 is substituted with Asn in HDACs 4, 5, 6 (domain 2), and 7. This evidence
suggests that these Asp 173 substitutions do not affect HDAC activity.

An amino acid sequence motif was previously found to be important for the binding of 
HDACs 1 and 2 to retinoblastoma protein (Rb). Complexes of HDACs 1 and 2 and Rb induce 
repression of E2F responsive promoters (Brehm, A., Miska, E. A., McCance, D. J., Reid, J. L.,
Bannister, A. J., and Kouzarides, T. (1998) Nature 391, 597-601). An Rb-binding motif fits the
sequence model LXCXE, where "X" can be any amino acid. The LXCXE domain has been
found to be dispensable for growth suppression function of Rb, but is necessary for HDAC
binding (Chen, T.-T. and Wang, J. Y. J. (2000) Mol Cell Biol 20, 5571-5580). The Rb-binding
domain that was previously determined in HDAC1 is located from amino acid 414 to amino acid
419 and is the sequence IACEE. So far, it has not been determined whether other HDACs are
capable of binding to Rb. However, HDAC9 contains a putative Rb-binding motif, LSCIL, that
aligned with HDAC1 IACEE and is located between amino acids 560 and 564.
Co-immunoprecipitation of HDAC9 with Rb is one strategy that may be used to validate the
function of this motif in HDAC9.
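
By way of illustration only, the comparison of the HDAC1 and HDAC9 pentapeptides with the LXCXE sequence model may be sketched in Python as follows. The function name fixed_position_matches is hypothetical and forms no part of the disclosure; the sketch merely compares residues at the three fixed positions of the model and makes no prediction regarding Rb binding.

```python
import re

# Strict sequence model from the text: L-X-C-X-E, where X is any amino acid.
LXCXE = re.compile(r"L.C.E")

def fixed_position_matches(peptide: str) -> int:
    """Count how many of the three fixed positions (L1, C3, E5) a pentapeptide matches.

    Hypothetical helper for illustration; not a binding predictor.
    """
    if len(peptide) != 5:
        raise ValueError("expected a 5-residue peptide")
    fixed = ((0, "L"), (2, "C"), (4, "E"))
    return sum(peptide[i] == aa for i, aa in fixed)

if __name__ == "__main__":
    for name, motif in (("HDAC1", "IACEE"), ("HDAC9", "LSCIL")):
        strict = LXCXE.fullmatch(motif) is not None
        print(name, motif, "strict match:", strict,
              "fixed positions matched:", fixed_position_matches(motif))
```

Under this comparison neither pentapeptide is a strict match to the model, and each agrees with it at two of the three fixed positions, which is consistent with the motif in HDAC9 being described as putative pending experimental validation.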

As a member of the HDAC family, HDAC9 could form biologically relevant complexes 
with proteins and display functions that have been described for other HDACs. For example, it is 
likely to be involved in the regulation of transcription as a component of complexes that are 
involved in transcriptional repression that is mediated through interactions of HDACs with 
multi-protein complexes and which requires deacetylase activity. Thus, increased activity or
expression of HDAC9 may be associated with numerous pathological conditions, including but
not limited to, abnormal cell proliferation, cancer, atherosclerosis, inflammatory bowel disease, 
host inflammatory or immune response, or psoriasis. 

Thus, the DNA/amino acid sequence and predicted structure of HDAC9 will be useful for 
designing agents (e.g. antagonists or inhibitors) useful to ameliorate conditions associated with 
abnormal HDAC activity. These may include, for example, antiproliferative or
antiinflammatory agents either through the use of small molecules or proteins (e.g. antibodies) 
directed against it or associated proteins in HDAC transcription repressor complexes. In 
addition, protein derived from the HDAC9 sequence may also be used as a therapeutic to modify 
host cell proliferative or inflammatory responses. 

To determine the expression pattern of the novel polypeptide, a panel of mRNAs from a 
variety of human tissues is subjected to Northern analysis. Data indicate that HDAC9 is 
expressed in human tissues, being detectable in brain, colon, heart, kidney, liver, placenta, small 
intestine, spleen, stomach and testes. Thus, HDAC9 represents a transcribed gene. 

Therefore, in one aspect, the present invention relates to a novel histone deacetylase 
(HDAC). As outlined above, HDAC9 is clearly a member of the HDAC family since it is highly 
similar to other HDAC proteins in the hda1 class II HDACs. It also shares many similarities
with the HDAC family.

The present invention relates to an isolated polypeptide comprising the amino acid 
sequence set forth in SEQ ID NO:1. For example, such a polypeptide may be a fusion protein
including the amino acid sequence of the novel HDAC9. In another aspect the present invention
relates to an isolated polypeptide consisting of the amino acid sequence set forth in SEQ ID
NO:1, which is, in particular, the novel HDAC9.

The invention includes nucleic acid or nucleotide molecules, preferably DNA molecules, 
in particular encoding the novel HDAC9. Preferably, an isolated nucleic acid molecule, 
preferably a DNA molecule, of the present invention encodes a polypeptide comprising the 
amino acid sequence set forth in SEQ ID NO:1, SEQ ID NO:5 or SEQ ID NO:6. Likewise
preferred is an isolated nucleic acid molecule, preferably a DNA molecule, encoding a
polypeptide consisting of the amino acid sequence set forth in SEQ ID NO:1, SEQ ID NO:5 or
SEQ ID NO:6. Such a nucleic acid or nucleotide, in particular such a DNA molecule, preferably
comprises a nucleotide sequence selected from the group consisting of (1) the nucleotide
sequence as set forth in SEQ ID NO:2, 7 or 8, which is the complete cDNA sequence encoding
the polypeptide consisting of the amino acid sequence set forth in SEQ ID NO:1, 5 and 6,
respectively; (2) the nucleotide sequence set forth in SEQ ID NO:3, which corresponds to the
open reading frame of the cDNA sequence set forth in SEQ ID NO:2; (3) a nucleotide sequence
capable of hybridizing under high stringency conditions to a nucleotide sequence set forth in
SEQ ID NO:3; and (4) the nucleotide sequence set forth in SEQ ID NO:4, which corresponds to
the endogenous genomic human DNA encoding the polypeptide consisting of the amino acid
sequence set forth in SEQ ID NO:1. Such hybridization conditions may be highly stringent or
less highly stringent, as described above. In instances wherein the nucleic acid molecules are 
deoxyoligonucleotides ("oligos"), highly stringent conditions may refer, e.g., to washing in 6X 
SSC/0.05% sodium pyrophosphate at 37 °C (for 14-base oligos), 48 °C (for 17-base oligos), 55 
°C (for 20-base oligos), and 60 °C (for 23-base oligos). Suitable ranges of such stringency 
conditions for nucleic acids of varying compositions are described in Krause and Aaronson 
(1991), Methods in Enzymology, 200:546-556 in addition to Maniatis et al., cited above. 
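
For convenience, the oligo wash temperatures recited above may be tabulated as a simple lookup, sketched here in Python. The names HIGH_STRINGENCY_WASH_C and wash_temperature_c are hypothetical; only the four length and temperature pairs come from the text, and the sketch deliberately refuses to interpolate for other oligo lengths.

```python
# Wash temperatures (degrees C) in 6X SSC/0.05% sodium pyrophosphate,
# keyed by oligo length in bases, exactly as recited in the text.
HIGH_STRINGENCY_WASH_C = {14: 37, 17: 48, 20: 55, 23: 60}

def wash_temperature_c(oligo: str) -> int:
    """Return the recited highly stringent wash temperature for an oligo.

    Raises KeyError for lengths the text does not list; no interpolation
    is attempted because the text gives none.
    """
    length = len(oligo)
    if length not in HIGH_STRINGENCY_WASH_C:
        raise KeyError(f"no recited wash temperature for a {length}-base oligo")
    return HIGH_STRINGENCY_WASH_C[length]

if __name__ == "__main__":
    example_oligo = "ATGCATGCATGCATGCA"  # arbitrary 17-base sequence for illustration
    print(len(example_oligo), "bases:", wash_temperature_c(example_oligo), "degrees C")
```

For oligos of other lengths or unusual composition, the references cited above remain the appropriate guide.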

These nucleic acid molecules may act as target gene antisense molecules, useful, for 
example, in target gene regulation and/or as antisense primers in amplification reactions of target 
gene nucleic acid sequences. Further, such sequences may be used as part of ribozyme and/or 
triple helix sequences, also useful for target gene regulation. Still further, such molecules may be 
used as components of diagnostic methods whereby the presence of an allele causing a disease 
associated with abnormal HDAC9 expression or activity, for example, abnormal cell 
proliferation, cancer, atherosclerosis, inflammatory bowel disease, host inflammatory or immune
response, or psoriasis, may be detected. 

The invention also encompasses (a) vectors that contain at least a fragment of any of the 
foregoing nucleotide sequences and/or their complements (i.e., antisense); (b) vector molecules, 
preferably vector molecules comprising transcriptional control sequences, in particular 
expression vectors, that contain any of the foregoing coding sequences operatively associated
with a regulatory element that directs the expression of the coding sequences; and (c) genetically
engineered host cells that contain a vector molecule as mentioned herein or at least a fragment of 
any of the foregoing nucleotide sequences operatively associated with a regulatory element that 
directs the expression of the coding sequences in the host cell. As used herein, regulatory 
elements include, but are not limited to, inducible and non-inducible promoters, enhancers, 
operators and other elements known to those skilled in the art that drive and regulate expression. 
Preferably, host cells can be vertebrate host cells, preferably mammalian host cells, such as 
human cells or rodent cells, such as CHO or BHK cells. Likewise preferred, host cells can be 
bacterial host cells, in particular E. coli cells.

Particularly preferred is a host cell, in particular of the above described type, which can 
be propagated in vitro and which is capable upon growth in culture of producing an HDAC9 
polypeptide, in particular a polypeptide comprising or consisting of an amino acid sequence set 
forth in SEQ ID NO:1, wherein said cell contains some fragment or complete sequence of
HDAC9 coding sequence in a construct that is controlled by one or more transcriptional control
sequences that are not transcriptional control sequences of the natural endogenous human gene
encoding said polypeptide, wherein said one or more transcriptional control sequences control 
transcription of a DNA encoding said polypeptide. Possible transcriptional control sequences 
include, but are not limited to, bacterial or viral promoter sequences. 

The invention includes the complete sequence of the gene as well as fragments of any of 
the nucleic acid sequences disclosed herein. Fragments of the nucleic acid sequences encoding 
the novel HDAC9 polypeptide may be used as a hybridization probe for a cDNA library to 
isolate other genes which have a high sequence similarity to the HDAC9 gene or similar 
biological activity. Probes of this type preferably have at least about 30 bases and may contain, 
for example, from about 30 to about 50 bases, about 50 to about 100 bases, about 100 to about 
200 bases, or more than 200 bases. The probe may also be used to identify a cDNA clone 
corresponding to a full length transcript and a genomic clone or clones that contain the complete 
HDAC9 gene including regulatory and promoter regions, exons, and introns. An example of a 
screen comprises isolating the coding region of the HDAC9 gene by using the known DNA 
sequence to synthesize an oligonucleotide probe. Labeled oligonucleotides having a sequence 
complementary to that of the gene of the present invention may be used to screen a library of
human cDNA, genomic DNA or mRNA to determine the members of the library to which the
probe hybridizes.
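
The design of a probe complementary to a chosen stretch of coding sequence may be sketched in Python as follows. The function names reverse_complement and antisense_probe are hypothetical, the example sequence is arbitrary and is not taken from any SEQ ID NO, and the minimum length of 30 bases simply reflects the preferred probe length stated above.

```python
# Translation table mapping each DNA base to its Watson-Crick complement.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    seq = seq.upper()
    if set(seq) - set("ACGT"):
        raise ValueError("sequence must contain only A, C, G and T")
    return seq.translate(_COMPLEMENT)[::-1]

def antisense_probe(coding: str, start: int, length: int = 30) -> str:
    """Return a probe complementary to coding[start:start + length].

    The 30-base minimum follows the preferred probe length in the text.
    """
    if length < 30:
        raise ValueError("probes preferably have at least about 30 bases")
    window = coding[start:start + length]
    if len(window) < length:
        raise ValueError("window runs past the end of the sequence")
    return reverse_complement(window)

if __name__ == "__main__":
    placeholder = "ATG" * 20  # arbitrary 60-base placeholder, not a real gene
    print(antisense_probe(placeholder, 0, 30))
```

Longer probes, for example about 50 to about 100 bases, are obtained by passing a larger length value.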

In addition to the gene sequences described above, homologs of such sequences, as may, 
for example, be present in other species, may be identified and may be readily isolated, without
undue experimentation, by molecular biological techniques well known in the art. Further, there
may exist genes at other genetic loci within the genome that encode proteins which have 
homology to one or more domains of such gene products. These genes may also be identified via 
similar techniques. For example, the isolated nucleotide sequence of the present invention 
encoding the novel HDAC9 polypeptide may be labeled and used to screen a cDNA library
constructed from mRNA obtained from the organism of interest. Hybridization conditions will
be of a lower stringency when the cDNA library is derived from an organism different from the 
type of organism from which the labeled sequence was derived. Alternatively, the labeled 
fragment may be used to screen a genomic library derived from the organism of interest, again, 
using appropriately stringent conditions. Such low stringency conditions will be well known to
those of skill in the art, and will vary predictably depending on the specific organisms from
which the library and the labeled sequences are derived. For guidance regarding such conditions
see, for example, Sambrook et al. cited above. 

Further, a previously unknown differentially expressed gene-type sequence may be 
isolated by performing PCR using two degenerate oligonucleotide primer pools designed on the
basis of amino acid sequences within the gene of interest. The template for the reaction may be
cDNA obtained by reverse transcription of mRNA prepared from human or non-human cell lines 
or tissue known or suspected to express a differentially expressed gene allele. The PCR product 
may be subcloned and sequenced to ensure that the amplified sequences represent the sequences 
of a differentially expressed gene-like nucleic acid sequence. The PCR fragment may then be
used to isolate a complete cDNA clone by a variety of conventional methods. For example, the
amplified fragment may be labeled and used to screen a bacteriophage cDNA library. 
Alternatively, the labeled fragment may be used to screen a genomic library. 
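
The construction of a degenerate primer pool from a short amino acid sequence may be sketched in Python as follows. The sketch assumes the standard genetic code; the names CODONS, degeneracy and primer_pool are hypothetical, and the tripeptide used in the example is arbitrary rather than drawn from HDAC9.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS: dict[str, list[str]] = {}
for index, triplet in enumerate(product(BASES, repeat=3)):
    CODONS.setdefault(AMINO[index], []).append("".join(triplet))

def degeneracy(peptide: str) -> int:
    """Number of distinct sense-strand sequences that encode the peptide."""
    count = 1
    for residue in peptide.upper():
        count *= len(CODONS[residue])
    return count

def primer_pool(peptide: str) -> list[str]:
    """Enumerate every sense-strand sequence encoding the peptide."""
    choices = (CODONS[residue] for residue in peptide.upper())
    return ["".join(codons) for codons in product(*choices)]

if __name__ == "__main__":
    peptide = "MKW"  # arbitrary example peptide
    print(peptide, "degeneracy:", degeneracy(peptide))
    for primer in primer_pool(peptide):
        print(primer)
```

Because pool size grows multiplicatively with each residue, stretches rich in Met and Trp (one codon each) give the smallest pools, whereas Leu, Ser and Arg (six codons each) enlarge them most.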

PCR technology may also be utilized to isolate full length cDNA sequences. For 
example, RNA may be isolated, following standard procedures, from an appropriate cellular or
tissue source. A reverse transcription reaction may be performed on the RNA using an
oligonucleotide primer specific for the most 5' end of the amplified fragment for the priming of
first strand synthesis. The resulting RNA/DNA hybrid may then be "tailed" with guanines using 
a standard terminal transferase reaction, the hybrid may be digested with RNase H, and second
strand synthesis may then be primed with a poly-C primer. Thus, cDNA sequences upstream of
the amplified fragment may easily be isolated. For a review of cloning strategies which may be
used, see e.g., Sambrook et al., 1989, supra. 

In cases where the gene identified is the normal, or wild type, gene, this gene may be 
used to isolate mutant alleles of the gene. Such an isolation is preferable in processes and 
disorders which are known or suspected to have a genetic basis. Mutant alleles may be isolated
from individuals either known or suspected to have a genotype which contributes to disease
symptoms related to abnormal HDAC activity, including, but not limited to, conditions such as 
abnormal cell proliferation, cancer, atherosclerosis, inflammatory bowel disease, host 
inflammatory or immune response, or psoriasis. Mutant alleles and mutant allele products may 
then be utilized in the diagnostic assay systems described below. 

A cDNA of the mutant gene may be isolated, for example, by using PCR, a technique
which is well known to those of skill in the art. In this case, the first cDNA strand may be
synthesized by hybridizing an oligo-dT oligonucleotide to mRNA isolated from tissue known or
suspected to express the gene in an individual putatively carrying the mutant allele, and by extending
the new strand with reverse transcriptase. The second strand of the cDNA is then synthesized
using an oligonucleotide that hybridizes specifically to the 5' end of the normal gene. Using these
two primers, the product is then amplified via PCR, cloned into a suitable vector, and subjected
to DNA sequence analysis through methods well known to those of skill in the art. By comparing
the DNA sequence of the mutant gene to that of the normal gene, the mutation(s) responsible for 
the loss or alteration of function of the mutant gene product can be ascertained. 
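
The comparison of a mutant sequence with the normal sequence may be sketched in Python as follows. The function name point_differences is hypothetical, and the sketch assumes that the two sequences have already been aligned to equal length, so that it reports substitutions only and not insertions or deletions.

```python
def point_differences(normal: str, mutant: str) -> list[tuple[int, str, str]]:
    """List (1-based position, normal base, mutant base) for each mismatch.

    Assumes the sequences are pre-aligned and of equal length; insertions
    and deletions would first require an alignment step not shown here.
    """
    if len(normal) != len(mutant):
        raise ValueError("sequences must be pre-aligned to equal length")
    pairs = zip(normal.upper(), mutant.upper())
    return [(pos, n, m) for pos, (n, m) in enumerate(pairs, start=1) if n != m]

if __name__ == "__main__":
    # Arbitrary six-base example sequences, for illustration only.
    for position, ref, alt in point_differences("ATGGCC", "ATGACC"):
        print(f"position {position}: {ref} -> {alt}")
```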

Alternatively, a genomic or cDNA library can be constructed and screened using DNA or
RNA, respectively, from a tissue known to or suspected of expressing the gene of interest in an
individual suspected of or known to carry the mutant allele. The normal gene or any suitable 
fragment thereof may then be labeled and used as a probe to identify the corresponding mutant 
allele in the library. The clone containing this gene may then be purified through methods
routinely practiced in the art, and subjected to sequence analysis as described above.

Additionally, an expression library can be constructed utilizing DNA isolated from or
cDNA synthesized from a tissue known to or suspected of expressing the gene of interest in an 
individual suspected of or known to carry the mutant allele. In this manner, gene products made 
by the putatively mutant tissue may be expressed and screened using standard antibody screening
techniques in conjunction with antibodies raised against the normal gene product, as described
below. (For screening techniques, see, for example, Harlow, E. and Lane, eds., 1988,
"Antibodies: A Laboratory Manual", Cold Spring Harbor Press, Cold Spring Harbor.) In cases
where the mutation results in an expressed gene product with altered function (e.g., as a result of
a missense mutation), a polyclonal set of antibodies is likely to cross-react with the mutant gene
product. Library clones detected via their reaction with such labeled antibodies can be purified
and subjected to sequence analysis as described above. 

The present invention includes those proteins encoded by nucleotide sequences set forth 
in any of SEQ ID NOs:2, 3, 4, 7 or 8, in particular, a polypeptide that is or includes the amino
acid sequence set out in SEQ ID NO:1, 5 or 6 or fragments thereof.

Furthermore, the present invention includes proteins that represent functionally
equivalent gene products. Such an equivalent differentially expressed gene product may contain
deletions, additions or substitutions of amino acid residues within the amino acid sequence
encoded by the differentially expressed gene sequences described above, but which result in a
silent change, thus producing a functionally equivalent differentially expressed gene product.
Amino acid substitutions may be made on the basis of similarity in polarity, charge, solubility,
hydrophobicity, hydrophilicity, and/or the amphipathic nature of the residues involved. 

For example, nonpolar (hydrophobic) amino acids include alanine, leucine, isoleucine, 
valine, proline, phenylalanine, tryptophan, and methionine; polar neutral amino acids include 
glycine, serine, threonine, cysteine, tyrosine, asparagine, and glutamine; positively charged
(basic) amino acids include arginine, lysine, and histidine; and negatively charged (acidic) amino
acids include aspartic acid and glutamic acid. "Functionally equivalent," as utilized herein, may 
refer to a protein or polypeptide capable of exhibiting a substantially similar in vivo or in vitro 
activity as the endogenous differentially expressed gene products encoded by the differentially 
expressed gene sequences described above. "Functionally equivalent" may also refer to proteins
or polypeptides capable of interacting with other cellular or extracellular molecules in a manner
substantially similar to the way in which the corresponding portion of the endogenous
differentially expressed gene product would. For example, a "functionally equivalent" peptide
would be able, in an immunoassay, to diminish the binding of an antibody to the corresponding
peptide (i.e., the peptide the amino acid sequence of which was modified to achieve the
"functionally equivalent" peptide) of the endogenous protein, or to the endogenous protein itself,
where the antibody was raised against the corresponding peptide of the endogenous protein. An 
equimolar concentration of the functionally equivalent peptide will diminish the aforesaid 
binding of the corresponding peptide by at least about 5%, preferably between about 5% and 
10%, more preferably between about 10% and 25%, even more preferably between about 25%
and 50%, and most preferably between about 40% and 50%.
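
The four residue classes recited above may be encoded as a simple lookup for screening candidate substitutions, sketched here in Python. The names AA_CLASSES, residue_class and same_class are hypothetical, and membership in the same class is offered only as a rough first filter; it does not establish that a substitution is functionally silent.

```python
# One-letter codes grouped exactly as the classes are recited in the text.
AA_CLASSES = {
    "nonpolar": set("ALIVPFWM"),      # Ala, Leu, Ile, Val, Pro, Phe, Trp, Met
    "polar_neutral": set("GSTCYNQ"),  # Gly, Ser, Thr, Cys, Tyr, Asn, Gln
    "basic": set("RKH"),              # Arg, Lys, His
    "acidic": set("DE"),              # Asp, Glu
}

def residue_class(residue: str) -> str:
    """Return the class name for a one-letter amino acid code."""
    for name, members in AA_CLASSES.items():
        if residue.upper() in members:
            return name
    raise ValueError(f"unrecognized amino acid code: {residue!r}")

def same_class(original: str, replacement: str) -> bool:
    """True when both residues fall within the same recited class."""
    return residue_class(original) == residue_class(replacement)

if __name__ == "__main__":
    for original, replacement in (("L", "I"), ("D", "E"), ("L", "E")):
        print(original, "->", replacement, "same class:", same_class(original, replacement))
```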

The polypeptides of the present invention may be produced by recombinant DNA 
technology using techniques well known in the art. Therefore, there is provided a method of 
producing a polypeptide of the present invention, which method comprises culturing a host cell 
having incorporated therein an expression vector containing an exogenously-derived
polynucleotide encoding a polypeptide comprising an amino acid sequence as set forth in SEQ
ID NOs: 1, 5 or 6 under conditions sufficient for expression of the polypeptide in the host cell, 
thereby causing the production of the expressed polypeptide. Optionally, said method further 
comprises recovering the polypeptide produced by said cell. In a preferred embodiment of such a 
method, said exogenously-derived polynucleotide encodes a polypeptide consisting of an amino
acid sequence set forth in SEQ ID NOs: 1, 5 or 6. Preferably, said exogenously-derived
polynucleotide comprises the nucleotide sequence as set forth in any of SEQ ID NO:2, SEQ ID
NO:3, SEQ ID NO:4, SEQ ID NO: 7 or SEQ ID NO: 8. In case of using the nucleotide 
sequence set forth in SEQ ID NO:3, i.e. the open reading frame, the sequence, when inserted into 
a vector, may be followed by one or more appropriate translation stop codons, preferably by the
natural endogenous stop codon TGA beginning at nucleotide 2021 in the cDNA sequence.
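
Appending a translation stop codon to an open reading frame before insertion into a vector may be sketched in Python as follows. The function name with_stop_codon is hypothetical, the example sequences are arbitrary, and TGA is used as the default only because it is the stop codon named above.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def with_stop_codon(orf: str, stop: str = "TGA") -> str:
    """Return the ORF terminated by a stop codon.

    The ORF is left unchanged if its final codon is already a stop codon;
    otherwise the requested stop codon is appended in frame.
    """
    orf = orf.upper()
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    if stop not in STOP_CODONS:
        raise ValueError(f"{stop!r} is not a stop codon")
    if orf[-3:] in STOP_CODONS:
        return orf
    return orf + stop

if __name__ == "__main__":
    print(with_stop_codon("ATGGCC"))  # arbitrary two-codon placeholder
```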

Thus, methods for preparing the polypeptides and peptides of the invention by expressing 
nucleic acid encoding respective nucleotide sequences are described herein. Methods which are 
well known to those skilled in the art can be used to construct expression vectors containing 
protein coding sequences and appropriate transcriptional/translational control signals. These
methods include, for example, in vitro recombinant DNA techniques, synthetic techniques and in
vivo recombination/genetic recombination. See, for example, the techniques described in
Sambrook et al., 1989, supra, and Ausubel et al., 1989, supra. Alternatively, RNA capable of 
encoding differentially expressed gene protein sequences may be chemically synthesized using, 
for example, synthesizers. See, for example, the techniques described in "Oligonucleotide
Synthesis", 1984, Gait, M. J. ed., IRL Press, Oxford, which is incorporated by reference herein in
its entirety. 

A variety of host-expression vector systems may be utilized to express the HDAC9 gene 
coding sequences of the invention. Such host-expression systems represent vehicles by which the 
coding sequences of interest may be produced and subsequently purified, but also represent cells
which may, when transformed or transfected with the appropriate nucleotide coding sequences,
exhibit the HDAC9 gene protein of the invention in situ. These include but are not limited to 
microorganisms such as bacteria (e.g., E. coli, B. subtilis) transformed with recombinant 
bacteriophage DNA, plasmid DNA or cosmid DNA expression vectors containing differentially 
expressed gene protein coding sequences; yeast (e.g. Saccharomyces, Pichia) transformed with
recombinant yeast expression vectors containing the differentially expressed gene protein coding
sequences; insect cell systems infected or transfected with recombinant virus expression vectors 
(e.g., baculovirus) containing the differentially expressed gene protein coding sequences; plant 
cell systems infected with recombinant virus expression vectors (e.g., cauliflower mosaic virus, 
CaMV; tobacco mosaic virus, TMV) or transformed with recombinant vectors, including
plasmids, (e.g., Ti plasmid) containing protein coding sequences; or mammalian cell systems
(e.g. COS, CHO, BHK, 293, 3T3) harboring recombinant expression constructs containing
promoters derived from the genome of mammalian cells (e.g., metallothionein promoter) or
from mammalian viruses (e.g., the adenovirus late promoter; the vaccinia virus 7.5K promoter, 
or the CMV promoter). 

Expression of the HDAC9 of the present invention by a cell from an HDAC9 encoding
gene that is native to the cell can also be performed. Methods for such expression are detailed in,
e.g., U.S. Patents 5,641,670; 5,733,761; 5,968,502; and 5,994,127, all of which are expressly 
incorporated by reference herein in their entirety. Cells that have been induced to express 
HDAC9 by the methods of any of U.S. Patents 5,641,670; 5,733,761; 5,968,502; and 5,994,127
can be implanted into a desired tissue in a living animal in order to increase the local
concentration of HDAC9 in the tissue. 

In bacterial systems, a number of expression vectors may be advantageously selected 
depending upon the use intended for the protein being expressed. For example, when a large 
quantity of such a protein is to be produced, for the generation of antibodies or to screen peptide
libraries, for example, vectors which direct the expression of high levels of fusion protein 
products that are readily purified may be desirable. In this respect, fusion proteins comprising 
hexahistidine tags may be used, such as EpiTag vectors including pcDNA3.1/His (Invitrogen,
Carlsbad, CA). Other vectors include, but are not limited to, the E. coli expression vector
pUR278 (Ruther et al., 1983, EMBO J. 2: 1791), in which the protein coding sequence may be
ligated individually into the vector in frame with the lac Z coding region so that a fusion protein 
is produced; pIN vectors (Inouye & Inouye, 1985, Nucleic Acids Res. 13:3101-3109; Van Heeke 
& Schuster, 1989, J. Biol. Chem. 264:5503-5509); and the like. pGEX vectors may also be used 
to express foreign polypeptides as fusion proteins with glutathione S-transferase (GST). In 
general, such fusion proteins are soluble and can easily be purified from lysed cells by adsorption
to glutathione-agarose beads followed by elution in the presence of free glutathione. The pGEX 
vectors are designed to include thrombin or factor Xa protease cleavage sites so that the cloned 
target gene protein can be released from the GST moiety. Fusion proteins containing Flag tags, 
such as 3X Flag (Sigma, St. Louis, MO) or myc tags, for example pcDNA3.1/myc-His
(Invitrogen, Carlsbad, CA) may be used. These fusions allow coimmunoprecipitation and
Western detection of proteins for which antibodies are not yet available. 

Promoter regions can be selected from any desired gene using vectors that contain a 
reporter transcription unit lacking a promoter region, such as a chloramphenicol acetyl 
transferase ("CAT"), or the luciferase transcription unit, downstream of a restriction site or sites
for introducing a candidate promoter fragment; i.e., a fragment that may contain a promoter. For
example, introduction into the vector of a promoter-containing fragment at the restriction site 
upstream of the cat gene engenders production of CAT activity, which can be detected by 
standard CAT assays. Vectors suitable to this end are well known and readily available. Two 
such vectors are pKK232-8 and pCM7. Thus, promoters for expression of polynucleotides of the
present invention include not only well known and readily available promoters, but also
promoters that readily may be obtained by the foregoing technique, using a reporter gene. 

Among known bacterial promoters suitable for expression of polynucleotides and 
polypeptides in accordance with the present invention are the E. coli lacI and lacZ promoters, the
T3 and T7 promoters, the T5 tac promoter, the lambda PR, PL promoters and the trp promoter.
Among known eukaryotic promoters suitable in this regard are the CMV immediate early 
promoter, the HSV thymidine kinase promoter, the early and late SV40 promoters, the promoters 
of retroviral LTRs, such as those of the Rous sarcoma virus ("RSV"), and metallothionein 
promoters, such as the mouse metallothionein-I promoter. For example, a plasmid construct
could contain an HDAC9 transcriptional control sequence fused to a reporter transcription unit
that encodes the coding region of β-galactosidase, chloramphenicol acetyltransferase, green
fluorescent protein or luciferase. This construct could be used to screen for small molecules that
modulate HDAC9 transcription. Such molecules are potential therapeutics. Furthermore, an 
HDAC9 reporter gene could be used to examine the effects of an HDAC9 therapeutic in
mammalian cells or xenografts using fluorescent reporters and imaging techniques, such as
fluorescence microscopy or Biophotonic in vivo imaging, a technology that produces visual and
quantitative measurements in real time (Xenogen, Palo Alto, CA). Changes in these reporters in 
normal, diseased or drug-treated tissue or cells would be indicators of changes in HDAC9 
expression or activity. 

In an insect system, Autographa californica nuclear polyhedrosis virus (AcNPV) is one 
of several insect systems that can be used as a vector to express foreign genes. The virus grows 
in Spodoptera frugiperda cells. The coding sequence may be cloned individually into
non-essential regions (for example the polyhedrin gene) of the virus and placed under control of an
AcNPV promoter (for example the polyhedrin promoter). Successful insertion of the coding
sequence will result in inactivation of the polyhedrin gene and production of non-occluded 
recombinant virus (i.e., virus lacking the proteinaceous coat coded for by the polyhedrin gene). 
These recombinant viruses are then used to infect Spodoptera frugiperda cells in which the 
inserted gene is expressed (e.g., see Smith et al., 1983, J. Virol. 46: 584; Smith, U.S. Pat. No.
4,215,051).

In mammalian host cells, a number of viral-based expression systems may be utilized. In
cases where an adenovirus is used as an expression vector, the coding sequence of interest may 
be ligated to an adenovirus transcription/translation control complex, e.g., the late promoter and 
tripartite leader sequence. This chimeric gene may then be inserted in the adenovirus genome by 
in vitro or in vivo recombination. Insertion in a non-essential region of the viral genome (e.g., 
region E1 or E3) will result in a recombinant virus that is viable and capable of expressing the
desired protein in infected hosts (e.g., See Logan & Shenk, 1984, Proc. Natl. Acad. Sci. USA
81:3655-3659). Specific initiation signals may also be required for efficient translation of
inserted gene coding sequences. These signals include the ATG initiation codon and adjacent 
sequences. In cases where an entire gene, including its own initiation codon and adjacent 
sequences, is inserted into the appropriate expression vector, no additional translational control 
signals may be needed. However, in cases where only a portion of the gene coding sequence is 
inserted, exogenous translational control signals, including, perhaps, the ATG initiation codon, 
must be provided. Furthermore, the initiation codon must be in phase with the reading frame of 
the desired coding sequence to ensure translation of the entire insert. These exogenous 
translational control signals and initiation codons can be of a variety of origins, both natural and 
synthetic. The efficiency of expression may be enhanced by the inclusion of appropriate 
transcription enhancer elements, transcription terminators, etc. (see Bittner et al., 1987, Methods
in Enzymol. 153:516-544). Other common systems are based on SV40, retrovirus or
adeno-associated virus. Selection of appropriate vectors and promoters for expression in a host cell is a
well-known procedure, and the requisite techniques for expression vector construction,
introduction of the vector into the host and expression in the host per se are routine skills in the
art. Generally, recombinant expression vectors will include origins of replication, a promoter
derived from a highly-expressed gene to direct transcription of a downstream structural
sequence, and a selectable marker to permit isolation of vector-containing cells after exposure to
the vector.
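
As a rough illustration of the reading-frame requirement described above, the following Python sketch checks whether an exogenous ATG initiation codon lies in phase with a downstream coding fragment. The function name `is_in_frame`, the example sequences and the offsets are hypothetical and are not taken from this specification; the sketch only considers the first ATG in the construct.

```python
def is_in_frame(construct: str, cds_start: int) -> bool:
    """Return True if the first ATG in the construct is in phase with cds_start.

    construct : full insert sequence, 5'->3' (hypothetical example data)
    cds_start : 0-based index at which the desired coding fragment begins
    """
    seq = construct.upper()
    atg = seq.find("ATG")  # first candidate initiation codon
    if atg == -1 or atg > cds_start:
        return False  # no initiation codon at or upstream of the fragment
    # "In phase" means the distance from the ATG to the fragment is a multiple of 3
    if (cds_start - atg) % 3 != 0:
        return False
    # A stop codon between the ATG and the fragment would end translation early
    stops = {"TAA", "TAG", "TGA"}
    codons = (seq[i:i + 3] for i in range(atg, cds_start, 3))
    return not any(codon in stops for codon in codons)


# Hypothetical construct: leader, ATG, one linker codon, then the coding fragment
print(is_in_frame("GCCACCATGGCTAAAGGG", 12))
```

A check of this kind only confirms phase and the absence of an intervening stop codon; it says nothing about the sequence context that may also influence initiation efficiency.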

In addition, a host cell strain may be chosen which modulates the expression of the 
inserted sequences, or modifies and processes the gene product in the specific fashion desired. 
Such modifications (e.g., glycosylation) and processing (e.g., cleavage) of protein products may 
be important for the function of the protein. Different host cells have characteristic and specific 
mechanisms for the post-translational processing and modification of proteins. Appropriate cell
lines or host systems can be chosen to ensure the correct modification and processing of the 
foreign protein expressed. To this end, eukaryotic host cells which possess the cellular 
machinery for proper processing of the primary transcript, glycosylation, and phosphorylation of 
the gene product may be used. Such mammalian host cells include but are not limited to CHO,
VERO, BHK, HeLa, COS, MDCK, 293, 3T3, WI38, etc. and are well known to one of skill in 
the art. 

For long-term, high-yield production of recombinant proteins, stable expression is
preferred. For example, cell lines that stably express a differentially expressed protein product of
a gene may be engineered. Rather than using expression vectors which contain viral origins of
replication, host cells can be transformed with DNA controlled by appropriate expression control
elements (e.g., promoter, enhancer sequences, transcription terminators, polyadenylation sites,
etc.), and a selectable marker. Following the introduction of the foreign DNA, engineered cells
may be allowed to grow for 1-2 days in an enriched medium, and then are switched to a selective
medium. The selectable marker in the recombinant plasmid confers resistance to the selection and
allows cells to stably integrate the plasmid into their chromosomes and grow to form foci which
in turn can be cloned and expanded into cell lines. This method may advantageously be used to
engineer cell lines that express the differentially expressed gene protein. Such engineered cell
lines may be particularly useful in screening and evaluation of compounds that affect the
endogenous activity of the expressed protein.

A number of selection systems may be used, including but not limited to the herpes
simplex virus thymidine kinase (Wigler, et al., 1977, Cell 11:223), hypoxanthine-guanine
phosphoribosyltransferase (Szybalska & Szybalski, 1962, Proc. Natl. Acad. Sci. USA 48:2026),
and adenine phosphoribosyltransferase (Lowy, et al., 1980, Cell 22:817) genes, which can be employed
in tk-, hgprt- or aprt- cells, respectively. Also, antimetabolite resistance can be used as the basis of
selection for dhfr, which confers resistance to methotrexate (Wigler, et al., 1980, Proc. Natl. Acad. Sci.
USA 77:3567; O'Hare, et al., 1981, Proc. Natl. Acad. Sci. USA 78:1527); gpt, which confers
resistance to mycophenolic acid (Mulligan & Berg, 1981, Proc. Natl. Acad. Sci. USA 78:2072);
neo, which confers resistance to the aminoglycoside G-418 (Colberre-Garapin, et al., 1981, J.
Mol. Biol. 150:1); and hygro, which confers resistance to hygromycin (Santerre, et al., 1984,
Gene 30:147) genes.

An alternative fusion protein system allows for the ready purification of non-denatured
fusion proteins expressed in human cell lines (Janknecht, et al., 1991, Proc. Natl. Acad. Sci. USA
88:8972-8976). In this system, the gene of interest is subcloned into a vaccinia recombination
plasmid such that the gene's open reading frame is translationally fused to an amino-terminal tag
consisting of six histidine residues. Extracts from cells infected with recombinant vaccinia virus
are loaded onto Ni2+ nitriloacetic acid-agarose columns and histidine-tagged proteins are
selectively eluted with imidazole-containing buffers.

When used as a component in assay systems such as those described below, a protein of
the present invention may be labeled, either directly or indirectly, to facilitate detection of a
complex formed between the protein and a test substance. Any of a variety of suitable labeling
systems may be used including, but not limited to, radioisotopes such as 125I; enzyme labeling
systems that generate a detectable colorimetric signal or light when exposed to substrate; and
fluorescent labels.

Where recombinant DNA technology is used to produce a protein of the present 
invention for such assay systems, it may be advantageous to engineer fusion proteins that can 
facilitate labeling, immobilization, detection and/or isolation.

Indirect labeling involves the use of a protein, such as a labeled antibody, which
specifically binds to a polypeptide of the present invention. Such antibodies include but are not
limited to polyclonal, monoclonal, chimeric, single chain, Fab fragments and fragments
produced by an Fab expression library.

In another embodiment, nucleic acids comprising a sequence encoding an HDAC9 protein
or a functional derivative thereof may be administered to promote normal biological function, for
example, normal transcriptional regulation, by way of gene therapy. Gene therapy refers to
therapy performed by the administration of a nucleic acid to a subject. In this embodiment of the
invention, the nucleic acid produces its encoded protein that mediates a therapeutic effect by
promoting normal transcriptional regulation.

Any of the methods for gene therapy available in the art can be used according to the
present invention. Exemplary methods are described below.

In a preferred aspect, the therapeutic comprises a HDAC9 nucleic acid that is part of an
expression vector that expresses a HDAC9 protein or fragment or chimeric protein thereof in a
suitable host. In particular, such a nucleic acid has a promoter operably linked to the HDAC9
coding region, said promoter being inducible or constitutive, and, optionally, tissue-specific. In
another particular embodiment, a nucleic acid molecule is used in which the HDAC9 coding
sequences and any other desired sequences are flanked by regions that promote homologous
recombination at a desired site in the genome, thus providing for intrachromosomal expression of
the HDAC9 nucleic acid (Koller and Smithies, 1989, Proc. Natl. Acad. Sci. USA 86:8932-8935;
Zijlstra et al., 1989, Nature 342:435-438).

Delivery of the nucleic acid into a patient may be either direct, in which case the patient 
is directly exposed to the nucleic acid or nucleic acid-carrying vector, or indirect, in which case, 
cells are first transformed with the nucleic acid in vitro, then transplanted into the patient. These
two approaches are known, respectively, as in vivo or ex vivo gene therapy.

In a specific embodiment, the nucleic acid is directly administered in vivo, where it is
expressed to produce the encoded product. This can be accomplished by any of numerous
methods known in the art, e.g., by constructing it as part of an appropriate nucleic acid
expression vector and administering it so that it becomes intracellular, e.g., by infection using a
defective or attenuated retroviral or other viral vector (see, e.g., U.S. Pat. No. 4,980,286 and
others mentioned infra), or by direct injection of naked DNA, or by use of microparticle
bombardment (e.g., a gene gun; Biolistic, Dupont), or coating with lipids or cell-surface
receptors or transfecting agents, encapsulation in liposomes, microparticles, or microcapsules, or
by administering it in linkage to a peptide which is known to enter the nucleus, by administering
it in linkage to a ligand subject to receptor-mediated endocytosis (see, e.g., U.S. Patents
5,166,320; 5,728,399; 5,874,297; and 6,030,954, all of which are incorporated by reference
herein in their entirety) (which can be used to target cell types specifically expressing the
receptors), etc. In another embodiment, a nucleic acid-ligand complex can be formed in which
the ligand comprises a fusogenic viral peptide to disrupt endosomes, allowing the nucleic acid to
avoid lysosomal degradation. In yet another embodiment, the nucleic acid can be targeted in vivo
for cell specific uptake and expression, by targeting a specific receptor (see, e.g., PCT
Publications WO 92/06180; WO 92/22635; WO 92/20316; WO 93/14188; and WO 93/20221).
Alternatively, the nucleic acid can be introduced intracellularly and incorporated within host cell
DNA for expression, by homologous recombination (see, e.g., U.S. Patents 5,413,923;
5,416,260; and 5,574,205; and Zijlstra et al., 1989, Nature 342:435-438).

In a specific embodiment, a viral vector that contains the HDAC9 nucleic acid is used. 
For example, a retroviral vector can be used (see, e.g., U.S. Patents 5,219,740; 5,604,090; and 
5,834,182). These retroviral vectors have been modified to delete retroviral sequences that are 
not necessary for packaging of the viral genome and integration into host cell DNA. The HDAC9
nucleic acid to be used in gene therapy is cloned into the vector, which facilitates delivery of the
gene into a patient.

Adenoviruses are other viral vectors that can be used in gene therapy. Adenoviruses are 
especially attractive vehicles for delivering genes to respiratory epithelia. Adenoviruses naturally 
infect respiratory epithelia where they cause a mild disease. Other targets for adenovirus-based
delivery systems are liver, the central nervous system, endothelial cells, and muscle.
Adenoviruses have the advantage of being capable of infecting non-dividing cells. Methods for
conducting adenovirus-based gene therapy are described in, e.g., U.S. Patents 5,824,544;
5,868,040; 5,871,722; 5,880,102; 5,882,877; 5,885,808; 5,932,210; 5,981,225; 5,994,106;
5,994,132; 5,994,134; 6,001,557; and 6,033,8843, all of which are incorporated by reference
herein in their entirety.

Adeno-associated virus (AAV) has also been proposed for use in gene therapy. Methods 
for producing and utilizing AAV are described, e.g., in U.S. Patents 5,173,414; 5,252,479; 
5,552,311; 5,658,785; 5,763,416; 5,773,289; 5,843,742; 5,869,040; 5,942,496; and 5,948,675, all 
of which are incorporated by reference herein in their entirety. 

Another approach to gene therapy involves transferring a gene to cells in tissue culture by
such methods as electroporation, lipofection, calcium phosphate mediated transfection, or viral
infection. Usually, the method of transfer includes the transfer of a selectable marker to the cells.
The cells are then placed under selection to isolate those cells that have taken up and are
expressing the transferred gene. Those cells are then delivered to a patient.

In this embodiment, the nucleic acid is introduced into a cell prior to administration in
vivo of the resulting recombinant cell. Such introduction can be carried out by any method 
known in the art, including but not limited to transfection, electroporation, microinjection, 
infection with a viral or bacteriophage vector containing the nucleic acid sequences, cell fusion,
chromosome-mediated gene transfer, microcell-mediated gene transfer, spheroplast fusion, etc.
Numerous techniques are known in the art for the introduction of foreign genes into cells and
may be used in accordance with the present invention, provided that the necessary developmental
and physiological functions of the recipient cells are not disrupted. The technique should provide
for the stable transfer of the nucleic acid to the cell, so that the nucleic acid is expressible by the
cell and preferably heritable and expressible by its cell progeny.

The resulting recombinant cells can be delivered to a patient by various methods known
in the art. In a preferred embodiment, epithelial cells are injected, e.g., subcutaneously. In
another embodiment, recombinant skin cells may be applied as a skin graft onto the patient.
Recombinant blood cells (e.g., hematopoietic stem or progenitor cells) are preferably
administered intravenously. The amount of cells envisioned for use depends on the desired
effect, patient state, etc., and can be determined by one skilled in the art.

Cells into which a nucleic acid can be introduced for purposes of gene therapy encompass 
any desired, available cell type, and include but are not limited to epithelial cells, endothelial 
cells, keratinocytes, fibroblasts, muscle cells, hepatocytes; blood cells such as T lymphocytes, B
lymphocytes, monocytes, macrophages, neutrophils, eosinophils, megakaryocytes, granulocytes;
various stem or progenitor cells, in particular hematopoietic stem or progenitor cells, e.g., as 
obtained from bone marrow, umbilical cord blood, peripheral blood, fetal liver, etc. 

In a preferred embodiment, the cell used for gene therapy is autologous to the patient.

In an embodiment in which recombinant cells are used in gene therapy, a HDAC9 nucleic
acid is introduced into the cells such that it is expressible by the cells or their progeny, and the
recombinant cells are then administered in vivo for therapeutic effect. In a specific embodiment,
stem or progenitor cells are used. Any stem and/or progenitor cells that can be isolated and
maintained in vitro can potentially be used in accordance with this embodiment of the present
invention. Such stem cells include but are not limited to hematopoietic stem cells (HSC), stem
cells of epithelial tissues such as the skin and the lining of the gut, embryonic heart muscle cells,
liver stem cells (see, e.g., WO 94/08598), and neural stem cells (Stemple and Anderson, 1992,
Cell 71:973-985).

Epithelial stem cells (ESCs) or keratinocytes can be obtained from tissues such as the 
skin and the lining of the gut by known procedures (Rheinwald, 1980, Meth. Cell Bio. 21A:229). 
In stratified epithelial tissue such as the skin, renewal occurs by mitosis of stem cells within the 
germinal layer, the layer closest to the basal lamina. Stem cells within the lining of the gut 
provide for a rapid renewal rate of this tissue. ESCs or keratinocytes obtained from the skin or 
lining of the gut of a patient or donor can be grown in tissue culture (Pittelkow and Scott, 1986, 
Mayo Clinic Proc. 61:771). If the ESCs are provided by a donor, a method for suppression of 
host versus graft reactivity (e.g., irradiation, drug or antibody administration to promote 
moderate immunosuppression) can also be used. 

With respect to hematopoietic stem cells (HSC), any technique which provides for the 
isolation, propagation, and maintenance in vitro of HSC can be used in this embodiment of the 
invention. Techniques by which this may be accomplished include (a) the isolation and 
establishment of HSC cultures from bone marrow cells isolated from the future host, or a donor, 
or (b) the use of previously established long-term HSC cultures, which may be allogeneic or 
xenogeneic. Non-autologous HSC are used preferably in conjunction with a method of 
suppressing transplantation immune reactions of the future host/patient. In a particular 
embodiment of the present invention, human bone marrow cells can be obtained from the 
posterior iliac crest by needle aspiration (see, e.g., Kodo et al., 1984, J. Clin. Invest. 73:1377-1384).
In a preferred embodiment of the present invention, the HSCs can be made highly
enriched or in substantially pure form. This enrichment can be accomplished before, during, or
after long-term culturing, and can be done by any techniques known in the art. Long-term
cultures of bone marrow cells can be established and maintained by using, for example, modified
Dexter cell culture techniques (Dexter et al., 1977, J. Cell Physiol. 91:335) or Whitlock-Witte
culture techniques (Whitlock and Witte, 1982, Proc. Natl. Acad. Sci. USA 79:3608-3612).

In a specific embodiment, the nucleic acid to be introduced for purposes of gene therapy 
comprises an inducible promoter operably linked to the coding region, such that expression of 
the nucleic acid is controllable by controlling the presence or absence of the appropriate inducer 
of transcription.

A further embodiment of the present invention relates to a purified antibody or a
fragment thereof which specifically binds to a polypeptide that comprises the amino acid
sequence set forth in SEQ ID NOs:1, 5 or 6 or to a fragment of said polypeptide. A preferred
embodiment relates to a fragment of such an antibody, which fragment is an Fab or F(ab')2
fragment. In particular, the antibody can be a polyclonal antibody or a monoclonal antibody.

Described herein are methods for the production of antibodies capable of specifically
recognizing one or more differentially expressed gene epitopes. Such antibodies may include, but
are not limited to, polyclonal antibodies, monoclonal antibodies (mAbs), humanized or chimeric
antibodies, single chain antibodies, Fab fragments, F(ab')2 fragments, fragments produced by a
Fab expression library, anti-idiotypic (anti-Id) antibodies, and epitope-binding fragments of any
of the above. Such antibodies may be used, for example, in the detection of a fingerprint or target
gene in a biological sample, or, alternatively, as a method for the inhibition of abnormal target
gene activity. Thus, such antibodies may be utilized as part of disease treatment methods, and/or
may be used as part of diagnostic techniques whereby patients may be tested for abnormal levels
of the HDAC9 polypeptide, or for the presence of abnormal forms of the HDAC9 polypeptide.

For the production of antibodies to the HDAC9 polypeptide, various host animals may be 
immunized by injection with the HDAC9 polypeptide, or a portion thereof. Such host animals 
may include but are not limited to rabbits, mice, and rats, to name but a few. Various adjuvants
may be used to increase the immunological response, depending on the host species, including
but not limited to Freund's (complete and incomplete), mineral gels such as aluminum hydroxide, 
surface active substances such as lysolecithin, pluronic polyols, polyanions, peptides, oil 
emulsions, keyhole limpet hemocyanin, dinitrophenol, and potentially useful human adjuvants 
such as BCG (bacille Calmette-Guerin) and Corynebacterium parvum. 

Polyclonal antibodies are heterogeneous populations of antibody molecules derived from
the sera of animals immunized with an antigen, such as target gene product, or an antigenic
functional derivative thereof. For the production of polyclonal antibodies, host animals such as
those described above, may be immunized by injection with the HDAC9 polypeptide, or a
portion thereof, supplemented with adjuvants as also described above.

Monoclonal antibodies, which are homogeneous populations of antibodies to a particular
antigen, may be obtained by any technique which provides for the production of antibody 
molecules by continuous cell lines in culture. These include, but are not limited to the hybridoma 
technique of Kohler and Milstein, (1975, Nature 256:495-497; and U.S. Pat. No. 4,376,110), the
human B-cell hybridoma technique (Kosbor et al., 1983, Immunology Today 4:72; Cole et al.,
1983, Proc. Natl. Acad. Sci. USA 80:2026-2030), and the EBV-hybridoma technique (Cole et
al., 1985, Monoclonal Antibodies And Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). Such
antibodies may be of any immunoglobulin class including IgG, IgM, IgE, IgA, IgD and any
subclass thereof. The hybridoma producing the mAb of this invention may be cultivated in vitro
or in vivo. Production of high titers of mAbs in vivo makes this the presently preferred method
of production.

In addition, techniques developed for the production of "chimeric antibodies" (Morrison
et al., 1984, Proc. Natl. Acad. Sci., 81:6851-6855; Neuberger et al., 1984, Nature, 312:604-608;
Takeda et al., 1985, Nature, 314:452-454) by splicing the genes from a mouse antibody molecule
of appropriate antigen specificity together with genes from a human antibody molecule of
appropriate biological activity can be used. A chimeric antibody is a molecule in which different
portions are derived from different animal species, such as those having a variable or
hypervariable region derived from a murine mAb and a human immunoglobulin constant region.

Alternatively, techniques described for the production of single chain antibodies (U.S.
Pat. No. 4,946,778; Bird, 1988, Science 242:423-426; Huston et al., 1988, Proc. Natl. Acad. Sci.
USA 85:5879-5883; and Ward et al., 1989, Nature 334:544-546) can be adapted to produce
differentially expressed gene-single chain antibodies. Single chain antibodies are formed by
linking the heavy and light chain fragments of the Fv region via an amino acid bridge, resulting
in a single chain polypeptide.

Most preferably, techniques useful for the production of "humanized antibodies" can be
adapted to produce antibodies to the polypeptides, fragments, derivatives, and functional
equivalents disclosed herein. Such techniques are disclosed in U.S. Patent Nos. 5,932,448;
5,693,762; 5,693,761; 5,585,089; 5,530,101; 5,910,771; 5,569,825; 5,625,126; 5,633,425;
5,789,650; 5,545,580; 5,661,016; and 5,770,429, the disclosures of all of which are incorporated
by reference herein in their entirety.

Antibody fragments that recognize specific epitopes may be generated by known
techniques. For example, such fragments include but are not limited to: the F(ab')2 fragments
which can be produced by pepsin digestion of the antibody molecule and the Fab fragments
which can be generated by reducing the disulfide bridges of the F(ab')2 fragments. Alternatively,
Fab expression libraries may be constructed (Huse et al., 1989, Science, 246:1275-1281) to allow
rapid and easy identification of monoclonal Fab fragments with the desired specificity.

An antibody of the present invention can preferably be used in a method for the diagnosis
of a condition associated with abnormal HDAC9 expression or activity, for example, abnormal
cell proliferation, cancer, atherosclerosis, inflammatory bowel disease, host inflammatory or
immune response, or psoriasis, in a human which comprises: measuring the amount of a
polypeptide comprising the amino acid sequence set forth in SEQ ID NOs:1, 5 or 6, or fragments
thereof, in an appropriate tissue or cell from a human suffering from a condition associated with
abnormal HDAC9 activity, wherein the presence of an elevated amount of said polypeptide or
fragments thereof, relative to the amount of said polypeptide or fragments thereof in the
respective tissue from a human not suffering from a condition associated with abnormal HDAC9
activity is diagnostic of said human's suffering from such condition. Such a method forms a
further embodiment of the present invention. Preferably, said detecting step comprises contacting
said appropriate tissue or cell with an antibody which specifically binds to a polypeptide that
comprises the amino acid sequence set forth in SEQ ID NOs:1, 5 or 6 or a fragment thereof and
detecting specific binding of said antibody with a polypeptide in said appropriate tissue or cell,
wherein detection of specific binding to a polypeptide indicates the presence of a polypeptide
that comprises the amino acid sequence set forth in SEQ ID NOs:1, 5 or 6 or a fragment thereof.

Particularly preferred, for ease of detection, is the sandwich assay, of which a number of 
variations exist, all of which are intended to be encompassed by the present invention. 

For example, in a typical forward assay, unlabeled antibody is immobilized on a solid
substrate and the sample to be tested is brought into contact with the bound molecule. After a
suitable period of incubation, sufficient to allow formation of an antibody-antigen
binary complex, a second antibody, labeled with a reporter molecule
capable of inducing a detectable signal, is added and incubated, allowing time sufficient for
the formation of a ternary complex of antibody-antigen-labeled antibody. Any unreacted material
is washed away, and the presence of the antigen is determined by observation of a signal, or may
be quantitated by comparing with a control sample containing known amounts of antigen.
Variations on the forward assay include the simultaneous assay, in which both sample and
antibody are added simultaneously to the bound antibody, or a reverse assay in which the labeled
antibody and sample to be tested are first combined, incubated and added to the unlabeled
surface bound antibody. These techniques are well known to those skilled in the art, and the
possibility of minor variations will be readily apparent. As used herein, "sandwich assay" is
intended to encompass all variations on the basic two-site technique. For the immunoassays of
the present invention, the only limiting factor is that the labeled antibody be an antibody which is
specific for the HDAC9 polypeptide or a fragment thereof.
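
The quantitation step mentioned above (comparison with controls containing known amounts of antigen) can be sketched as a simple standard-curve interpolation. This is a minimal illustration only: the function name `quantitate`, the standard values and the units are hypothetical, and linear interpolation between neighbouring standards is an assumption, not a method stated in this specification.

```python
from bisect import bisect_left


def quantitate(signal, standards):
    """Estimate antigen amount from a measured signal by linear interpolation.

    standards : list of (known_amount, measured_signal) pairs from control
                samples (hypothetical values). The signal is assumed to rise
                monotonically with antigen amount over the covered range.
    """
    pts = sorted(standards, key=lambda p: p[1])  # order standards by signal
    signals = [s for _, s in pts]
    if not signals[0] <= signal <= signals[-1]:
        raise ValueError("signal lies outside the range of the standards")
    i = bisect_left(signals, signal)  # first standard with signal >= measured
    if signals[i] == signal:
        return pts[i][0]
    (a0, s0), (a1, s1) = pts[i - 1], pts[i]
    # Straight line between the two bracketing standards
    return a0 + (a1 - a0) * (signal - s0) / (s1 - s0)


# Hypothetical standards: (amount, absorbance)
curve = [(0.0, 0.05), (10.0, 0.25), (50.0, 1.05)]
print(quantitate(0.65, curve))
```

Signals outside the range of the standards are rejected rather than extrapolated, since the response of an immunoassay is not generally linear beyond its calibrated range.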

The most commonly used reporter molecules in this type of assay are either enzymes, 
fluorophore- or radionuclide-containing molecules. In the case of an enzyme immunoassay an 
enzyme is conjugated to the second antibody, usually by means of glutaraldehyde or periodate. 
As will be readily recognized, however, a wide variety of different ligation techniques exist, 
which are well-known to the skilled artisan. Commonly used enzymes include horseradish
peroxidase, glucose oxidase, beta-galactosidase and alkaline phosphatase, among others. The
substrates to be used with the specific enzymes are generally chosen for the production, upon
hydrolysis by the corresponding enzyme, of a detectable color change. For example,
p-nitrophenyl phosphate is suitable for use with alkaline phosphatase conjugates; for peroxidase
conjugates, 1,2-phenylenediamine or toluidine are commonly used. It is also possible to employ
fluorogenic substrates, which yield a fluorescent product, rather than the chromogenic substrates
noted above. A solution containing the appropriate substrate is then added to the ternary
complex. The substrate reacts with the enzyme linked to the second antibody, giving a qualitative
visual signal, which may be further quantitated, usually spectrophotometrically, to give an
evaluation of the amount of HDAC9 which is present in the serum sample.

Alternately, fluorescent compounds, such as fluorescein and rhodamine, may be 
chemically coupled to antibodies without altering their binding capacity. When activated by 
illumination with light of a particular wavelength, the fluorochrome-labeled antibody absorbs the 
light energy, inducing a state of excitability in the molecule, followed by emission of the light at
a characteristic longer wavelength. The emission appears as a characteristic color visually
detectable with a light microscope. Immunofluorescence and EIA techniques are both very well
established in the art and are particularly preferred for the present method. However, other
reporter molecules, such as radioisotopes, chemiluminescent or bioluminescent molecules may 
also be employed. It will be readily apparent to the skilled artisan how to vary the procedure to 
suit the required use. 

This invention also relates to the use of polynucleotides of the present invention as 
diagnostic reagents. In particular, the invention relates to a method for the diagnosis of a 
condition associated with abnormal HDAC9 expression or activity, for example, abnormal cell 
proliferation, cancer, atherosclerosis, inflammatory bowel disease, host inflammatory or immune 
response, or psoriasis in a human which comprises: detecting elevated transcription of messenger
RNA transcribed from the natural endogenous human gene encoding the polypeptide consisting
of an amino acid sequence set forth in SEQ ID NOs:1, 5 or 6 in an appropriate tissue or cell
from a human, wherein said elevated transcription is diagnostic of said human's suffering from
the condition associated with abnormal HDAC9 expression or activity. In particular, said natural
endogenous human gene comprises the nucleotide sequence set forth in SEQ ID NO:4, 7 or 8.
In a preferred embodiment such a method comprises contacting a sample of said appropriate
tissue or cell or contacting an isolated RNA or DNA molecule derived from that tissue or cell
with an isolated nucleotide sequence of at least about 20 nucleotides in length that hybridizes
under high stringency conditions with the isolated nucleotide sequence encoding a polypeptide
consisting of an amino acid sequence set forth in SEQ ID NOs:1, 5 or 6.

Detection of a mutated form of the gene characterized by the polynucleotide of SEQ ID
NO:4, 7 or 8 which is associated with a dysfunction will provide a diagnostic tool that can add
to, or define, a diagnosis of a disease, or susceptibility to a disease, which results from
under-expression, over-expression or altered spatial or temporal expression of the gene. Individuals
carrying mutations in the gene may be detected at the DNA level by a variety of techniques. 

Nucleic acids, in particular mRNA, for diagnosis may be obtained from a subject's cells, 
such as from blood, urine, saliva, tissue biopsy or autopsy material. The genomic DNA may be 
used directly for detection or may be amplified enzymatically by using PCR or other 
amplification techniques prior to analysis. RNA or cDNA may also be used in similar fashion. 
Deletions and insertions can be detected by a change in size of the amplified product in 
comparison to the normal genotype. Point mutations can be identified by hybridizing amplified
DNA to labeled nucleotide sequences encoding the HDAC9 polypeptide of the present invention. 
Perfectly matched sequences can be distinguished from mismatched duplexes by RNase 
digestion or by differences in melting temperatures. DNA sequence differences may also be 
detected by alterations in electrophoretic mobility of DNA fragments in gels, with or without 
denaturing agents, or by direct DNA sequencing (e.g., Myers et al., Science (1985) 230:1242). 
Sequence changes at specific locations may also be revealed by nuclease protection assays, such
as RNase and S1 protection or the chemical cleavage method (see Cotton et al., Proc Natl Acad
Sci USA (1985) 85: 4397-4401). In another embodiment, an array of oligonucleotide probes
comprising nucleotide sequence encoding the HDAC9 polypeptide of the present invention or
fragments of such a nucleotide sequence can be constructed to conduct efficient screening of
e.g., genetic mutations. Array technology methods are well known and have general applicability 
and can be used to address a variety of questions in molecular genetics including gene 
expression, genetic linkage, and genetic variability (see for example: M. Chee et al., Science, Vol 
274, pp 610-613 (1996)). 
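
The size-based detection of deletions and insertions described above can be illustrated with a minimal sketch. The function name, the tolerance parameter and the example lengths are hypothetical and are not part of the disclosure; the sketch only compares an observed amplicon length with the length expected for the normal genotype.

```python
def classify_amplicon(observed_bp: int, reference_bp: int, tolerance_bp: int = 0) -> str:
    """Compare an observed amplicon length with the normal-genotype length.

    tolerance_bp is a hypothetical allowance for sizing error on a gel.
    """
    delta = observed_bp - reference_bp
    if abs(delta) <= tolerance_bp:
        return "no size change"
    if delta > 0:
        return f"insertion of {delta} bp"
    return f"deletion of {-delta} bp"


# Hypothetical example: a 500 bp product is expected for the normal genotype.
for observed in (500, 512, 470):
    print(observed, classify_amplicon(observed, 500))
```

A point mutation leaves the amplicon length unchanged, so it would be reported as "no size change" here and would instead require the hybridization, melting or sequencing approaches named above.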

The diagnostic assays offer a process for diagnosing or determining a susceptibility to 
disease through detection of mutation in the HDAC9 gene by the methods described. In addition, 
such diseases may be diagnosed by methods comprising determining from a sample derived from 
a subject an abnormally decreased or increased level of polypeptide or mRNA. Decreased or 
increased expression can be measured at the RNA level using any of the methods well known in 
the art for the quantitation of polynucleotides, such as, for example, nucleic acid amplification, 
for instance PCR, RT-PCR, RNase protection, Northern blotting and other hybridization 
methods. Assay techniques that can be used to determine levels of a protein, such as a 
polypeptide of the present invention, in a sample derived from a host are well-known to those of 
skill in the art. Such assay methods include radioimmunoassays, competitive-binding assays, 
Western Blot analysis and ELISA assays. 

Thus in another aspect, the present invention relates to a diagnostic kit which comprises: 

(a) a polynucleotide of the present invention, preferably the nucleotide sequence of SEQ 
ID NO:2, 3, 4, 7 or 8 or a fragment thereof; 

(b) a nucleotide sequence complementary to that of (a); 
(c) a polypeptide of the present invention, preferably the polypeptide of SEQ ID NOs:1, 5 
or 6 or a fragment thereof; or 

(d) an antibody to a polypeptide of the present invention, preferably to the polypeptide of 
SEQ ID NOs:1, 5 or 6. 

It will be appreciated that in any such kit, (a), (b), (c) or (d) may comprise a substantial 
component. Such a kit will be of use in diagnosing a disease or susceptibility to a disease, 
particularly to a disease or condition associated with abnormal HDAC9 expression or activity, 
for example, abnormal cell proliferation, cancer, atherosclerosis, inflammatory bowel disease, 
host inflammatory or immune response, or psoriasis. 

The nucleotide sequences of the present invention are also valuable for chromosome 
localization. The sequence is specifically targeted to, and can hybridize with, a particular 
location on an individual human chromosome. The mapping of relevant sequences to 
chromosomes according to the present invention is an important first step in correlating those 
sequences with gene associated disease. Once a sequence has been mapped to a precise 
chromosomal location, the physical position of the sequence on the chromosome can be 
correlated with genetic map data. Such data are found in, for example, V. McKusick, Mendelian 
Inheritance in Man (available on-line through Johns Hopkins University Welch Medical 
Library). The relationships between genes and diseases that have been mapped to the same 
chromosomal region are then identified through linkage analysis (coinheritance of physically 
adjacent genes). 

The differences in the cDNA or genomic sequence between affected and unaffected 
individuals can also be determined. If a mutation is observed in some or all of the affected 
individuals but not in any normal individuals, then the mutation is likely to be the causative 
agent of the disease. 

An additional embodiment of the invention relates to the administration of a 
pharmaceutical composition, in conjunction with a pharmaceutically acceptable carrier, excipient 
or diluent, for any of the therapeutic effects discussed above. Such pharmaceutical compositions 
may consist of HDAC9, antibodies to that polypeptide, mimetics, agonists, antagonists, or 
inhibitors of HDAC9 function. The compositions may be administered alone or in combination 
with at least one other agent, such as a stabilizing compound, which may be administered in any 
sterile, biocompatible pharmaceutical carrier, including, but not limited to, saline, buffered 
saline, dextrose, and water. The compositions may be administered to a patient alone, or in 
combination with other agents, drugs or hormones. 

In addition, any of the therapeutic proteins, antagonists, antibodies, agonists, antisense 
sequences or vectors described above may be administered in combination with other appropriate 
therapeutic agents. Selection of the appropriate agents for use in combination therapy may be 
made by one of ordinary skill in the art, according to conventional pharmaceutical principles. 
The combination of therapeutic agents may act synergistically to effect the treatment or 
prevention of the various disorders described above. Using this approach, one may be able to 
achieve therapeutic efficacy with lower dosages of each agent, thus reducing the potential for 
adverse side effects. Antagonists and agonists of HDAC9 may be made using methods which 
are generally known in the art. 

The pharmaceutical compositions encompassed by the invention may be administered by 
any number of routes including, but not limited to, oral, intravenous, intramuscular, intra- 
articular, intra-arterial, intramedullary, intrathecal, intraventricular, transdermal, subcutaneous, 
intraperitoneal, intranasal, enteral, topical, sublingual, or rectal means. 

In addition to the active ingredients, these pharmaceutical compositions may contain 
suitable pharmaceutically-acceptable carriers comprising excipients and auxiliaries which 
facilitate processing of the active compounds into preparations which can be used 
pharmaceutically. Further details on techniques for formulation and administration may be found 
in the latest edition of Remington's Pharmaceutical Sciences (Mack Publishing Co., Easton, 
Pa.). 

Pharmaceutical compositions for oral administration can be formulated using 
pharmaceutically acceptable carriers well known in the art in dosages suitable for oral 
administration. Such carriers enable the pharmaceutical compositions to be formulated as tablets, 
pills, dragees, capsules, liquids, gels, syrups, slurries, suspensions, and the like, for ingestion by 
the patient. 

Pharmaceutical preparations for oral use can be obtained through combination of active 
compounds with solid excipient, optionally grinding a resulting mixture, and processing the 
mixture of granules, after adding suitable auxiliaries, if desired, to obtain tablets or dragee cores. 
Suitable excipients are carbohydrate or protein fillers, such as sugars, including lactose, sucrose, 
mannitol, or sorbitol; starch from corn, wheat, rice, potato, or other plants; cellulose, such as 
methyl cellulose, hydroxypropylmethyl-cellulose, or sodium carboxymethylcellulose; gums 
including arabic and tragacanth; and proteins such as gelatin and collagen. If desired, 
disintegrating or solubilizing agents may be added, such as the cross-linked polyvinyl 
pyrrolidone, agar, alginic acid, or a salt thereof, such as sodium alginate. 

Dragee cores may be used in conjunction with suitable coatings, such as concentrated 
sugar solutions, which may also contain gum arabic, talc, polyvinylpyrrolidone, carbopol gel, 
polyethylene glycol, and/or titanium dioxide, lacquer solutions, and suitable organic solvents or 
solvent mixtures. Dyestuffs or pigments may be added to the tablets or dragee coatings for 
product identification or to characterize the quantity of active compound, i.e., dosage. 

Pharmaceutical preparations which can be used orally include push-fit capsules made of 
gelatin, as well as soft, sealed capsules made of gelatin and a coating, such as glycerol or 
sorbitol. Push-fit capsules can contain active ingredients mixed with a filler or binders, such as 
lactose or starches, lubricants, such as talc or magnesium stearate, and, optionally, stabilizers. In 
soft capsules, the active compounds may be dissolved or suspended in suitable liquids, such as 
fatty oils, liquid, or liquid polyethylene glycol with or without stabilizers. 

Pharmaceutical formulations suitable for parenteral administration may be formulated in 
aqueous solutions, preferably in physiologically compatible buffers such as Hanks' solution, 
Ringer's solution, or physiologically buffered saline. Aqueous injection suspensions may contain 
substances which increase the viscosity of the suspension, such as sodium carboxymethyl 
cellulose, sorbitol, or dextran. Additionally, suspensions of the active compounds may be 
prepared as appropriate oily injection suspensions. Suitable lipophilic solvents or vehicles 
include fatty oils such as sesame oil, or synthetic fatty acid esters, such as ethyl oleate or 
triglycerides, or liposomes. Non-lipid polycationic amino polymers may also be used for 
delivery. Optionally, the suspension may also contain suitable stabilizers or agents which 
increase the solubility of the compounds to allow for the preparation of highly concentrated 
solutions. 
For topical or nasal administration, penetrants appropriate to the particular barrier to be 
permeated are used in the formulation. Such penetrants are generally known in the art. 

The pharmaceutical compositions of the present invention may be manufactured in a 
manner that is known in the art, e.g., by means of conventional mixing, dissolving, granulating, 
dragee-making, levigating, emulsifying, encapsulating, entrapping, or lyophilizing processes. 

The pharmaceutical composition may be provided as a salt and can be formed with many 
acids, including but not limited to, hydrochloric, sulfuric, acetic, lactic, tartaric, malic, succinic, 
etc. Salts tend to be more soluble in aqueous or other protonic solvents than are the 
corresponding free base forms. In other cases, the preferred preparation may be a lyophilized 
powder which may contain any or all of the following: 1-50 mM histidine, 0.1%-2% sucrose, 
and 2-7% mannitol, at a pH range of 4.5 to 5.5, that is combined with buffer prior to use. 

After pharmaceutical compositions have been prepared, they can be placed in an 
appropriate container and labeled for treatment of an indicated condition. For administration of 
the HDAC9, such labeling would include amount, frequency, and method of administration. 

Pharmaceutical compositions suitable for use in the invention include compositions 
wherein the active ingredients are contained in an effective amount to achieve the intended 
purpose. The determination of an effective dose is well within the capability of those skilled in 
the art. 

For any compound, the therapeutically effective dose can be estimated initially either in 
cell culture assays, e.g., of neoplastic cells, or in animal models, usually mice, rabbits, dogs, or 
pigs. The animal model may also be used to determine the appropriate concentration range and 
route of administration. Such information can then be used to determine useful doses and routes 
for administration in humans. 

A therapeutically effective dose refers to that amount of active ingredient, for example 
HDAC9 or fragments thereof, antibodies of HDAC9, agonists, antagonists or inhibitors of 
HDAC9, which ameliorates the symptoms or condition. Therapeutic efficacy and toxicity may be 
determined by standard pharmaceutical procedures in cell cultures or experimental animals, e.g., 
ED50 (the dose therapeutically effective in 50% of the population) and LD50 (the dose lethal to 
50% of the population). The dose ratio between toxic and therapeutic effects is the therapeutic 
index, and it can be expressed as the ratio, LD50/ED50. Pharmaceutical compositions which 
exhibit large therapeutic indices are preferred. The data obtained from cell culture assays and 
animal studies is used in formulating a range of dosage for human use. The dosage contained in 
such compositions is preferably within a range of circulating concentrations that include the 
ED50 with little or no toxicity. The dosage varies within this range depending upon the dosage 
form employed, sensitivity of the patient, and the route of administration. 
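
The therapeutic index defined above, the ratio LD50/ED50, can be illustrated with a short calculation. The function name and the dose values are hypothetical and are used only to show the arithmetic; they are not data from the present disclosure.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the therapeutic index LD50/ED50.

    Both doses must be positive and expressed in the same units.
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical doses in mg/kg, for illustration only.
print(therapeutic_index(ld50=500.0, ed50=25.0))
```

A larger ratio indicates a wider margin between the dose that is therapeutically effective and the dose that is lethal in 50% of the population, which is why compositions with large indices are preferred.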

The exact dosage will be determined by the practitioner, in light of factors related to the 
subject that requires treatment. Dosage and administration are adjusted to provide sufficient 
levels of the active moiety or to maintain the desired effect. Factors which may be taken into 
account include the severity of the disease state, general health of the subject, age, weight, and 
gender of the subject, diet, time and frequency of administration, drug combination(s), reaction 
sensitivities, and tolerance/response to therapy. Long-acting pharmaceutical compositions may 
be administered every 3 to 4 days, every week, or once every two weeks depending on half-life 
and clearance rate of the particular formulation. 

Normal dosage amounts may vary from 0.1 to 100,000 micrograms, up to a total dose of 
about 1 g, depending upon the route of administration. Guidance as to particular dosages and 
methods of delivery is provided in the literature and generally available to practitioners in the art. 
Those skilled in the art will employ different formulations for nucleotides than for proteins or 
their inhibitors. Similarly, delivery of polynucleotides or polypeptides will be specific to 
particular cells, conditions, locations, etc. Pharmaceutical formulations suitable for oral 
administration of proteins are described, e.g., in U.S. Patents 5,008,114; 5,505,962; 5,641,515; 
5,681,811; 5,700,486; 5,766,633; 5,792,451; 5,853,748; 5,972,387; 5,976,569; and 6,051,561. 

The following Examples illustrate the present invention, without in any way limiting the 
scope thereof. 

Examples 

Example 1: Identification of a novel HDAC related human DNA sequence using bioinformatics 
HDAC9 was identified using computer software for the identification of new members of gene 
families based on a strategy to find maximal evolutionary links among known HDAC family 
members by first searching the non-redundant amino acid database, followed by searching less 
diverse databases such as the Celera Human Genome Database (CHGD), public High 
Throughput Genomic (HTG) database and the Incyte LIFESEQ™ database. Smith-Waterman 
(Pearson W. R. Comparison of methods for searching protein sequence databases. Protein Sci 
(1995) 4, 1145-60) and Hidden Markov Model (probability models derived from the diversity of 
amino acids at every position; Eddy S. R. Hidden Markov models. Curr Opin Struct Biol (1996) 
6, 361-5) searches were performed. A 1156 bp open reading frame (ORF) was identified and used to 
search a database of sequenced clones from pan-tissue and dorsal root ganglion cDNA libraries. 
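
For orientation, the Smith-Waterman local alignment named above can be sketched as follows. This is a minimal illustration with a hypothetical scoring scheme (match +2, mismatch -1, linear gap -2) and toy sequences; it is not the software, the scoring matrix or the parameters actually used to identify HDAC9.

```python
def smith_waterman(a: str, b: str, match: int = 2, mismatch: int = -1, gap: int = -2):
    """Return (best score, aligned part of a, aligned part of b) for a local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the matrix; every cell is floored at zero so an alignment can start anywhere.
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(0, diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if score[i][j] == diag:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))


# Toy example: the shared segment is recovered from unrelated flanking sequence.
print(smith_waterman("TTACGTTT", "GGACGTGG"))
```

Protein database searches of the kind described would ordinarily use a substitution matrix and affine gap penalties rather than the fixed scores assumed here.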

Example 2: Construction of pan-tissue and dorsal root ganglion cDNA libraries 
Pan-tissue and dorsal root ganglion cDNA libraries are prepared from polyA+ RNA. Total RNA 
was extracted from a pooled sample of 31 human tissues or dorsal root ganglia and isolated using 
TRIZOL reagent according to manufacturer's instructions (Life Technologies, Rockville, MD). 
mRNA is isolated using Polytract mRNA Isolation System III according to manufacturer's 
instructions (Promega, Madison, WI). Total RNA is hybridized to a biotinylated-oligo(dT) probe. 
The oligo(dT)-mRNA hybrids are captured on streptavidin magnesphere particles and eluted in 
RNase-free H2O. 3 ul of biotinylated-oligo(dT) probe (50 pmol/ul) and 13 ul of 20X SSC are added 
to 60-150 ug of RNA that is heated to 65°C in RNase-free water. This mixture is incubated at 
room temperature until it is completely cooled. Streptavidin-paramagnetic particles (beads) are 
resuspended and washed 3 times in 0.5X SSC and then resuspended in 0.5X SSC. The RNA- 
oligo(dT) hybrids from the previous step are added to these beads. To release the poly-A RNA 
from the beads, the beads are resuspended in RNase-free water and magnetically captured and 
then the eluate from the beads is ethanol precipitated. First and second strand cDNA synthesis is 
performed using a modified procedure from Life Technologies (D'Alessio, J. M., Gruber, C.E., 
Cain, C. R., and Noon, M. C. (1990) Focus 12, 47). First strand synthesis is performed by 
incubating 1-5 ug of RNA that is heated to 60°C in 1X 1st strand buffer (Life Technologies)/6 
mM DTT/600 nM dNTPs/2 units anti-RNase. This mixture is incubated at 40°C for 2 min, then 
Superscript II reverse transcriptase (RT) and 1 ul of Display Thermo RT terminator mix are added 
and the mixture is incubated at 40°C for 1 h, followed by incubation at 60°C for 10 min. Second 
strand synthesis is performed in 1X second strand buffer (Life Technologies) in DEPC-H2O/66 
nM/1 ul E. coli DNA ligase/4 ul E. coli DNA polymerase I/1 ul E. coli RNase H. This mixture is 
incubated at 10°C for 10 min and then at 16°C for 2 h. To this mixture, 2 ul of T4 DNA 
polymerase is added and incubation is continued at 16°C for 5 min. The reaction is stopped with 
10 ul of 0.5M EDTA, extracted with phenol/chloroform/isoamyl alcohol and then ethanol 
precipitated. Sal I and Not I adaptors are added to the 5' ends of the cDNAs by ligation for 
directional cloning using conventional methodology. The cDNAs are then passed through a size 
fractionation column to retrieve cDNAs that are >500 bp in length according to manufacturer's 
instructions (Life Technologies, Rockville, MD). cDNAs are ligated to Sal I/Not I digested 
Gateway compatible pCMV-Sport6 vector (Life Technologies, Rockville, MD) using 
conventional methods. Competent DH10B cells (Life Technologies, Rockville, MD) are 
transformed with the resulting library using conventional methods. Semi-solid amplification of 
the libraries is performed according to the manufacturer's instructions (Life Technologies, 
Rockville, MD). 

Example 3: Preparation of full length cDNA encoding the novel HDAC9 consisting of SEQ ID NO:1, 5 or 6 
A 1156 base pair ORF was used to search a database of sequenced clones from 
pan-tissue and dorsal root ganglion cDNA libraries using BLAST. Four clones were found to 
contain the ORF (M6, K10, P3, F23), two from each library. Of these clones M6 from the pan- 
tissue library was determined to be the most complete, but missing approximately 44 bp from the 
N-terminus. A protein slightly smaller than that predicted for the complete cDNA was observed 
by in vitro translation. The observation of protein upon in vitro translation of the 
incomplete cDNA suggests the possibility of alternate translation initiation sites within HDAC9. 
Specifically, sequencing of HDAC9 in pCMVSport6 was performed using an automated ABI 
Sequencer (ACGT, Northbrook, IL). PCR was performed using the conditions listed in the ABI 
Prism BigDye™ Terminator Cycle Sequencing Ready Reaction Kit manual, which are as follows: 
denaturation at 96°C for 30 seconds, annealing at 50°C for 15 seconds, extension at 60°C for 4 
minutes, for a total of 25 cycles. Each round of sequencing provided between 200 and 600 bp of 
sequence. PCR primers for 1st round sequencing were 5'-ATTTAGGTGACACTATAG-3' (Sp6, 
sense) and 5'-TAATACGACTCACTATAGGG-3' (T7, antisense). Results of sequencing using 
Sp6 primer are as follows. Bolded sequence is pCMVSport6 vector sequence. 

CTggtACCGGTCCGGAATTCCCGGGATATCGTCGACCCACGCGTCCG/GGCTGCT 
CCCGGCCGAAGCCCCGAGTGCGAGATCGAGCGTCCTGAGCGCCTGACCGCAGCCCT 

GGATCGCCTGCGGCAGCGCGGCCTGGAACAGAGGTGTCTGCGGTTGTCAGCCCGCG 

AGGCCTCGGAAGAGGAGCTGGGCCTGGTGCACAGCCCAGAGTATGTATCCCTGGTC 

AGGGAGACCCAGGTCCTAGGCAAGGAGGAGCTGCAGGCGCTGTCCGGACAGTTCGA 

CGCCATCTACTTCCACCCGAGTACCTTTCACTGCGCGCGGCTGGCCGCAGGGGCTGG 

ACTGCAGCrGGTGGACGCTGTGCTCACTGGAGCTGTGCArAAATGGGCTTGCCCTGG 

TGAGGCCTCCCGGGCACCATGGCCAGAGGGCGGCTGCAACGGGTTCTGCGTGTTCA 

ACAACGTGGCCATAGCAGCTGCACATGCCAAGCAGAAACACGGGCTACACAGGATC 

CTCGTCGTGGACTGGGGGATGTGCACCATGGCAGGGGGATCCAGTATCTCTTTGAAG 

GATGACCCCAGCGTCCTTTACTTCTCCTGGCACCGCTATGAGCATTGGGCGCCTTCT 

GGCCTTTCTGCGAGAGTCAGATGAgACGCATGGGGGGCGGGGGACAGGGCCTCGGC 

TTCACTGTCaACCTGCCCTGACCAAGTTgGGGGAATGGGGAAACGCTGACTTACGTG 

GCTGGCCTTCTTGCACCTTGCTGGTTCCAcTGGCCTTTTGGAGTT^ 

GTGCTTGGTcTCgGCAGGGATTTGACTcagcCaTtCgGGACCCTGAgGGGGCAAA. Results 

of sequencing using the T7 primer were: 

TCAAGCCACCAGGTGAGGATGGCACTGCAACATCTTCCACTGAGGCTCCAGCTGCCC 

TCTCAGGTACATCAGGGCCTGGACGTCCTCTGGGGAGGCCACAGAGGAAGGGCCTA 

GGCTAGGAGGTGCCTCTCCATTCAGCACCCGGGCCAGGATCCCTGCTAGCTGGGGTG 

TGGAGTTCTCCTCCAGGAGGGCCAGGACTCGGCCCCCTGCCAGCCCCCGAAGCATTG 

CAGCCAGGAGTGCAGCGTGGGGGCCCTGCAGGCCATGGCCAGGCCCCAGCGCCACC 

AGCACCAGGTCAGGCTGGAAGCCATAGGCCAGGGGCAGCaCCAAGCCCAAGATGCA 

GCTCAGGAAACCACCGGTCATCACTGGCAGTGGCGTGGAGACATGGAACATGGA[T 

AGGGCAGcCGCCTCCTTGCCCTGATGTTCAGCCACAGACTcCTCCCGTCATGGGCGA 

AGTCTGGAGGCCGGTCCAgCTGTtaGGCCACGCACAGAgtCTCTGGGCTCCgtGGGACA 

gGCCT:TTTtGAAAAGAtA.TTtAGGGTGGGTTGTGAacaggGCTGGAATGGCTGGTATAcC 

AcTGtTTAcCTGCCATT. 2nd and 3rd round sequencing primers are designed to prime sequence 
obtained from the previous round of sequencing. 2nd round primers are 
5'-GTCATCACTGGCAGTGGCGTG-3' (HUF7392, antisense) and 5'-TGGACTGCAGCTGGTGG-3' (DF-2, 
sense). Results of sequencing using the DF-2 primer were: CTGGcAAATGGGCTTGCCCTGG 

TGAGGCCTCCCGGGCACCATGGCCAGAGGGCGGCTGCCAACGGGTTCTGCGTGTTC 
AACAACGTGGCCATAGCAGCTGCACATGCCAAGCAGAAACACGGGCTACACAGGAT 

CCTCGTCGTGGACTGGGATGTGCACCATGGCCAGGGGATCCAGTATCTCTTTGAGGA 

TGACCCCAGCGTCCTTTACTTCTCCTGGCACCGCTATGAGCATGGGCGCTTCTGGCCT 

TTCCTGCGAGAGTCAGATGCAGACGCAGTGGGGCGGGGACAGGGCCTCGGCTTCAC 

TGTCAACCTGCCCTGGAACCAGGTTGGGATGGGAAACGCTGACTACGTGGCTGCCTT 

cCTGCACCTGCTGCTCCCACTGGCCTTTGAGTTTGACCCTGAGCTGGTGCTGGTCTCG 

GCAGGATTTGACTCAGCCATCGGGGACCCTGAGGGGCAAATGCAGGCCACGCCAGA 

GTGCTTCGCCCACCTCACACAGCTGCTGCAGGTGCTGGCCGGCGGCCGGGTCTGTGC 

CGTGCTGGAGGGCGGCTACCACCTGGAGTCACTGGCGGAGTCAGTGTGCATGACAG 

TACAGACGCTGCTGGGTGACCCGGcCCCACCCCTGTCAGGGCCAATGGCGCCATGTC 

AGAGTGCCCTAgAgTCATTCAgAGTGCCCGTGCTGCCAGGcCCCGCACTGGAAAgAgG 

CTTCAgCAGCAAgATGTGACCGcTGTGCCGATGAACCCCA. Sequencing results for the 

HUF7392 primer were: TGtaTAGGGcAGCCGCCTCCTTGCC 

CCTGATGTTCAGCCACAGACTCCTCCCGTCATGGGCGAGG 

TCTGGAGGCCGGTCCAGCTGTCCCAGGGCCACGCACAGCAGCCTCTGGGCTCCGTG 

GGACAGGCCTCTCCGAACAGCCACATCCAGGGTGGCTGCTGCAGCAGAGGCTGGAG 

TGGCTGCTATACCACTGTTCACCTGCCCATCCAGCATCCCATCTAAGAGGTACAGGA 

GCTTCCCAAGTGCAGTGAGGGCCTCCTCCCGGGCCAGGGACTCGTGTGGCCTGGCCC 

AGGCTTCTGTCTCCTCCCTCAGGGCTGACGCTTCTGTTGGATGACGTCAGGGGGCAG 

AACCAATGTGATATCCGGCGTTGTCAAGGGCAACAGCGGTGCGGACAGAGGGTGCG 

GGGCAGAGGCACgGCTGGTCCAgGAGGGAGCTCGGTGCAgATGCAGcTGCCTTACAC 

ACTGgACCCCCAGGCAGCAGAGGTGGAGGCCTCCCCTCTGGGGAGTG. 3rd round 
sequencing primers were 5'-AACAGCGGTGCGGACAGA-3' (HUF2A, antisense) and 
5'-CTGGAGTCACTGGCGGAG-3' (DF3A, sense). Results of sequencing using the DF3A primer 
were: AgcaCAGAcGCTgCTGGGTGACCCGGCCCACCCCTG 

TCAGGGCCAATGGCGCCATGTCAGAGTGCCCTAGAGTCCATCCAGAGTGcCCGTGCT 

GCCCAGGCCCCGCACTGGAAGAGCCTCCAGCAGCAAGATGTGACCGCTGTGCCGAT 

GAGCCCCAGCAGCCACTCCCCAGAGGGGAGGCCTCCACCTCTGCTGCCTGGGGGTC 

CAGTGTGTAAGGCAGCTGCATCTGCACCGAGCTCCCTCCTGGACCAGCCGTGCCTCT 

GCCCCGCACCCTCTGTCCGCACCGCTGTTGCCCTGACAACGCCGGATATCACATTGG 
TTCTGCCCCCTGACGTCATCCAACAGGAAGCGTCAGCCCTGAGGGAGGAGACAGAA 

GCCTGGGCCAGGCCACACGAGTCCCTGGCCCGGGAGGAGGCCCTcACTGcACTTGGG 

AAGCTCCTGTACCTcTTAgATGGGATGCTGGATGGGCAGGTGAACAgTGGTATA. 

Results of sequencing using HUF2A primer were: TgcaCGGATGGTCCAGGAGGGAGCTCG 

GTGCAAATGCAGCTGCCTTACACACTGGACCCCCAGGCAGCAgAGGTGGAGGCCTC 

CCCTcTGGGGAGTGGCTGCTGGGGCTCATCGGCACAGCGGTCACATCTTGCTGCTGG 

AGGCTCTTCCAGTGCGGGGCCTGGGCAGCACGGGCACTCTGGATGGACTCTAGGGC 

ACTCTGACATGGCGCCATTGGCCCTGACAGGGGTGGGGCCGGGTCACCCAGCAGCG 

TCTGTACTGTCATGCACACTGACTCCGCCAGTGACTCCAGGTGGTAGCCGCCCTCCA 

GCACGGCACAgACCCGGCCGCCGGCCAGCACCTGCAGCAGCTGTGTGAGGTGGGCg 

AAGCACTCTGGCGTGGCCTGCATTTGCCCCTCAGGGTCCCCGATGGCTTGAGTCAAA 

TCCTGCCGAGACCAGCACCAGCTCAGGGTCAAACTCAAAGGCCAGTGGGAGCAGCA 

GGTGCAGGAAGGCAGCCACgTATCAGCGTTTCCCATCCCAACCTGgTTCCAGGGGCA 

GGTTGAACAGTGAAGCCGAGGGCCCCTTGTCCCCgCCCCACCTTGCGTCTGCATctGA 

CTCTCGCAGGAAAGGCCAAgAAGCgCCCATgCTATTTT. The overlapping sequence from 

the combined sense and antisense sequencing was reconstructed to give the complete cDNA 

sequence of HDAC9. See Figure 2A. 
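
The reconstruction of a contiguous sequence from sense and antisense reads can be sketched in simplified form. The helper names and the toy reads below are hypothetical, and the sketch assumes error-free reads joined by an exact suffix-prefix overlap; the raw reads reported above contain ambiguous base calls, so an actual assembly would require an error-tolerant alignment.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement, so an antisense read can be compared to the sense strand."""
    return seq.translate(_COMPLEMENT)[::-1]


def merge_by_overlap(left: str, right: str, min_overlap: int = 8):
    """Join two same-strand reads when a suffix of `left` exactly equals a prefix of `right`.

    Returns the merged sequence, or None if no overlap of at least min_overlap bases exists.
    """
    for k in range(min(len(left), len(right)), min_overlap - 1, -1):
        if left[-k:] == right[:k]:
            return left + right[k:]
    return None


# Toy reads (hypothetical): a sense read and an antisense read covering the downstream end.
sense_read = "ATGGCCAGAGGGCGG"
antisense_read = reverse_complement("AGAGGGCGGCTGCAACGG")

contig = merge_by_overlap(sense_read, reverse_complement(antisense_read))
print(contig)
```

The antisense read is first converted to the sense orientation and then merged on the longest exact overlap, which mirrors the primer-walking strategy in which each round of sequencing is primed from sequence obtained in the previous round.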

BLAST is used to search the Genbank database using cDNA clone M6 as the query to 
identify a genomic sequence containing M6 cDNA sequence. The results of this search identified 
a genomic sequence AL022328 that was found to contain exons that were identical in sequence 
to the M6 cDNA. The sequence of cDNA clone M6 was confirmed by automated DNA 
sequencing (ACGT, Inc. Northbrook, IL). See Figure 2A. 

The remaining 44 bp of N-terminal sequence was added by PCR using the nested sense 
strand primers 5'-GCGGTCGACGCCACCATGGGGACCGCGCTTGTGTACCATGAGGACATG-3' 
and 5'-GTGTACCATGAGGACATGACGGCCACCCGGCTGCTCTGGGACGACCCCGAGTGC-3' 
and the 3' primer 5'-GAACCAATGTGATATCCGGCGTTG-3'. The 5' primer 
added a Kozak sequence and a SalI site for cloning and the 3' primer sequence overlaps the 
EcoRV site in HDAC9. PCR was performed using a step-cycle file for amplification using 1 
cycle of 94°C for 30 seconds, 68°C for 30 seconds, and 72°C for 1 minute, followed by 20 cycles 
of 94°C for 30 seconds and 72°C for 1 minute. 
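
The features of the nested primers described above can be checked with a short sketch. The primer sequences are those given in the text; the recognition sequences for SalI (GTCGAC) and EcoRV (GATATC) are the standard sites, and the Kozak string GCCACCATGG is assumed here as a commonly cited consensus form. The variable and function names are hypothetical.

```python
SALI_SITE = "GTCGAC"    # SalI recognition sequence
ECORV_SITE = "GATATC"   # EcoRV recognition sequence
KOZAK = "GCCACCATGG"    # assumed consensus form of the Kozak sequence

outer_5p = "GCGGTCGACGCCACCATGGGGACCGCGCTTGTGTACCATGAGGACATG"
inner_5p = "GTGTACCATGAGGACATGACGGCCACCCGGCTGCTCTGGGACGACCCCGAGTGC"
primer_3p = "GAACCAATGTGATATCCGGCGTTG"


def shared_overlap(upstream: str, downstream: str) -> str:
    """Return the longest prefix of `downstream` that is also a suffix of `upstream`."""
    for k in range(min(len(upstream), len(downstream)), 0, -1):
        if upstream.endswith(downstream[:k]):
            return downstream[:k]
    return ""


print("SalI site in outer 5' primer:", SALI_SITE in outer_5p)
print("Kozak sequence in outer 5' primer:", KOZAK in outer_5p)
print("EcoRV site in 3' primer:", ECORV_SITE in primer_3p)
print("nested primer overlap:", shared_overlap(outer_5p, inner_5p))
```

The 3' end of the outer sense primer matches the 5' end of the inner sense primer, which is what allows the two primers to extend the 5' end of the cDNA in a nested fashion.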

Example 3: HDAC9 sequence variants 

Three variants of the HDAC9 sequence, HDAC9v1, HDAC9v2, and HDAC9v3 were 
found. HDAC9v1 is the original sequence found and described above. HDAC9v2 was found in 
the human dorsal root ganglion cDNA library and in AL022328 genomic sequence. HDAC9v3 is 
a predicted transcript that lacks a stop codon that was found in the Celera human genomic 
database. HDAC9v1 contains 20 exons and HDAC9v2 has 20 exons. Comparison of the peptide 
sequences of HDAC9 variants demonstrated that HDAC9v1 and HDAC9v2 were identical up to 
exon 17, but diverge after this exon. HDAC9v2 has an extended intron between exon 17 and 18 
and an extended exon 18 that contains HDAC9v1 exon 19, but lacks exon 20, as a result of a single 
nucleotide insertion at nucleotide 446. This insertion frame shifts the sequence and shortens the 
peptide by 11 amino acids (Fig. 11A). Compared to HDAC9v1 and HDAC9v2, HDAC9v3 has an 
internal deletion of amino acids 219 through 240 and diverges in its C-terminus beginning at 
amino acid 486. HDAC9 is the first HDAC enzyme for which sequence variants have been 
reported. HDAC9v1 is the sequence variant that is characterized, unless otherwise noted. 

Example 4: Identification of HDAC-associated sequence motifs. 

The M6 clone was analyzed for the presence of motifs that would indicate an HDAC 
catalytic domain and a binding site for Rb and Rb-like proteins. HDACs are characterized by the 
presence of a catalytic domain with conserved amino acids. Most of the HDACs that have been 
identified to date have one catalytic domain, with the exception of HDAC6 that has two 
domains. N-terminal catalytic domains have been associated with class I HDACs, while C- 
terminal catalytic domains are associated with class II HDACs. An N-terminal catalytic domain 
was found in HDAC9 based upon PFAM prediction and alignment with the catalytic domains of 
other HDACs. A set of conserved amino acids was previously shown to be critical for HDAC 
activity and to provide the critical contacts for the HDAC inhibitor TSA, based upon single amino acid 
mutations in HDAC1 and the three dimensional structure formed by a complex of an HDAC-like 
protein (HDLP), Zn2+ and the HDAC inhibitor TSA (Hassig CA, Tong JK, Fleischer TC, Owa T, 
Grable PG, Ayer DE, Schreiber SL. (1998) Proc Natl Acad Sci USA. 95, 3519-3524; Finnin, 
M. S., Donigian, J. R., Cohen, A., Richon, V. M., Rifkind, R. A., Marks, P. A., Breslow, R., and 
Pavletich, N. P. (1999) Structures of a histone deacetylase homologue bound to TSA and SAHA 
inhibitors. Nature 401, 188-193). HDLP, a bacterial protein with similarities in sequence and enzymatic 
activity to human HDACs and the only class I HDAC-like structure elucidated, was used 
as an HDAC template. Many of these conserved amino acids, with a few exceptions, were found 
in HDAC9 (Table 4). Alignments of HDAC peptide sequences indicated that the hydrophobic 
residue Leu 265 that forms part of the binding pocket in HDLP is replaced with Glu at amino 
acid 272 in HDAC9. Similarly, Leu 265 is also replaced with Met in HDAC8 and with Lys in 
HDAC6 domain 1. Furthermore, Asp 173 in HDLP is substituted with Gln at position 177 in 
HDAC9, a difference that was also found in the HDAC6 catalytic domain 1. This Asp is 
substituted with Asn in HDAC4, HDAC5, HDAC6 domain 2, and HDAC7. HDAC1-8 have been 
shown to be catalytically active, hence the amino acid substitutions in these proteins have no 
enzymatic consequences. 

HDAC9 is similar in sequence to class I and class II HDACs. HDACs have been 
classified by their sequence similarity with yeast HDACs Rpd3, Hda1, and Sir2 and by catalytic 
domain location. Alignment of the peptide sequences of HDAC9, yeast HDACs Rpd3, Hda1, 
the Hda1 subfamily member from fission yeast, cryptic loci regulator 3 (Clr3), and Sir2 determined 
that HDAC9 had the highest sequence similarity with Clr3 (Table 1). However, the sequence 
similarity is not high enough to categorize HDAC9. 

Alignment of human HDACs 1-9 and Sir1-7 peptide sequences demonstrated that 
HDAC9 was most similar to class II human HDAC6 (Table 2). Alignment of class I and class II 
HDAC catalytic domains with HDAC9 catalytic domains demonstrated that HDAC6 catalytic 
domain 1 has the most sequence similarity with HDAC9 (Table 3). 

In order to compare the locations of catalytic domains in HDACs, PFAM predictions 
were made of the catalytic domains in HDAC peptides (Fig. 11B). The location of HDAC9 
catalytic domain was at the N-terminus, similar to class I HDACs, and was estimated as 
spanning the amino acid sequence from amino acid 4 to 323. In addition, the average length of 
class I HDACs is 443 amino acids, while the average length of class II HDACs is 1069 amino 
acids. The 673 amino acid HDAC9 peptide is between the average sizes of class I and class II 
HDACs (Fig. 11B). 


Table 1. 

HDAC Class    HDAC Isoform    % Similarity to HDAC9 
Class I       Rpd3            16 
Class II      Hda1            18 
              Clr3            23 
Class III     Sir2            5 

Table 2. 

HDAC Class    HDAC Isoform    % Similarity to HDAC9 
Class I       HDAC1           14 
              HDAC2           15 
              HDAC3           15 
              HDAC8           22 
Class II      HDAC4           21 
              HDAC5           19 
              HDAC6           37 
              HDAC7           20 
Class III     Sir1            5 
              Sir2            7 
              Sir3            11 
              Sir4            4 
              Sir5            8 
              Sir6            10 
              Sir7            15 


Table 3. 

HDAC Class    HDAC Isoform    % Similarity to HDAC9 
Class I       HDAC1           20 
              HDAC2           20 
              HDAC3           20 
              HDAC8           19 
Class II      HDAC4           39 
              HDAC5           38 
              HDAC6-1         55 
              HDAC6-2         53 
              HDAC7           40 



The protein product of the retinoblastoma (Rb) gene is a transcriptional regulator
that controls DNA synthesis, the cell cycle, differentiation and apoptosis and plays a
tissue-specific role in normal development. Rb complexes with the transcription factor E2F, an
interaction that is regulated by phosphorylation. Mutations in Rb lead to a hereditary form of
cancer of the retina, retinoblastoma. Mutations have also been found in a number of mesenchymal
and epithelial cancers. Mutations that affect regulators of Rb phosphorylation, including cyclin
D1, cdk4, and p16, have been found in many cancers. Therefore, Rb function is thought to play a
critical role in tumorigenesis (Sellers, W.R., Kaelin, W.G. Jr. (1997) J. Clin. Oncol. 15,
3301-3312; DiCiommo, D., Gallie, B.L., Bremner, R. (2000) Semin. Cancer Biol. 10, 255-269). An
Rb-binding motif was previously defined as the amino acid sequence LXCXE, where "X" can be
any amino acid (Chen, T.-T. and Wang, J. Y. J. (2000) Mol. Cell. Biol. 20, 5571-5580). The
LXCXE domain in HDAC1 was found to be dispensable for the growth suppression function of Rb,
but necessary for HDAC binding to Rb. Two putative Rb-binding motifs were found in HDAC9
(Fig. 11A, green boxes). LLCVA is located between amino acids 510 and 515, and LSCIL is
located between amino acids 560 and 564. Both are present in HDAC9v1 and HDAC9v2.
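
By way of illustration only, a scan for the strict LXCXE consensus defined above might be sketched in Python as follows. The function name find_lxcxe and the toy peptide are hypothetical and are not part of the described work; the sketch reports only exact L-X-C-X-E matches and makes no statement about how the putative HDAC9 motifs were identified.

```python
import re

# Strict Rb-binding consensus: Leu, any residue, Cys, any residue, Glu.
# The lookahead lets overlapping candidates be reported as well.
LXCXE = re.compile(r"(?=(L.C.E))")


def find_lxcxe(protein: str) -> list[tuple[int, str]]:
    """Return (1-based start position, matched pentapeptide) for each hit."""
    return [(m.start() + 1, m.group(1)) for m in LXCXE.finditer(protein.upper())]


# Hypothetical toy peptide, not an HDAC sequence.
toy_peptide = "MKTLHCYEGGSLACDEAA"
for position, motif in find_lxcxe(toy_peptide):
    print(position, motif)
```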
Example 5: mRNA distribution of HDAC9 in normal tissues

mRNA distribution of HDAC9 in normal tissues is investigated using Northern analysis.
Probes are prepared by 32P-labeling a 750 bp EcoRV/NotI HDAC9 fragment using the Redi-Prime
random nucleotide labelling kit according to the manufacturer's instructions (Amersham,
Piscataway, NJ). A Northern blot containing polyA+ RNA from 12 normal tissues (Origene
Technologies, Rockville, MD) and an array of matched tumor versus normal cDNAs (Clontech,
Palo Alto, CA) are probed with the [32P]-labeled 750 bp EcoRV/NotI HDAC9 fragment and
washed under high stringency conditions (68°C). Hybridized blots are washed two times for 15
min at 68°C in 2X SSC/0.1% SDS followed by two 30 min washes in 0.1X SSC/0.1% SDS at
68°C. The blot is exposed to film with an intensifying screen for 18 hr. Results indicate that an
approximately 3.0 Kb HDAC9 mRNA is detected in brain, colon, heart, kidney, liver, lung,
placenta, small intestine, spleen, stomach and testes. HDAC9 message was not detected in
muscle, but GAPDH was also not detected. See Figure 7.

Analogous computer techniques using BLAST (Altschul, S.F., 1993, 1990) are used to
search for identical or related molecules in nucleotide databases such as GenBank or the
LIFESEQ™ database. The basis of the search is the product score, which is defined as:

product score = (% sequence identity x % maximum BLAST score) / 100

The product score takes into account both the degree of similarity between two sequences and
the length of the sequence match. For example, with a product score of 40, the match will be
exact within a 1-2% error; and at 70, the match will be exact. Homologous molecules are
usually identified by selecting those which show product scores between 15 and 40, although
lower scores may identify related molecules.
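
The product score defined above is a simple arithmetic quantity. A minimal Python sketch is given here for illustration; the function name product_score and the example inputs are hypothetical, and both inputs are assumed to be expressed as percentages (0-100).

```python
def product_score(percent_identity: float, percent_max_blast_score: float) -> float:
    """Product score = (% sequence identity x % maximum BLAST score) / 100."""
    for value in (percent_identity, percent_max_blast_score):
        if not 0.0 <= value <= 100.0:
            raise ValueError("inputs are expected as percentages between 0 and 100")
    return percent_identity * percent_max_blast_score / 100.0


# Hypothetical hit: 80% identity at 50% of the maximum BLAST score.
print(product_score(80.0, 50.0))
```

Because both factors are percentages, the score also lies between 0 and 100, which is why thresholds such as 15, 40 and 70 can be compared directly across searches.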

The results of Northern analysis are reported as a list of libraries in which the transcript
encoding HDAC9 occurs. Abundance and percent abundance are also reported. Abundance
directly reflects the number of times a particular transcript is represented in a cDNA library, and
percent abundance is abundance divided by the total number of sequences examined in the
cDNA library.
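
For illustration, the abundance calculation described above can be sketched in Python. The function name and the library counts are hypothetical; multiplying the ratio by 100 to express it as a percentage is an assumption, since the text states only that abundance is divided by the total number of sequences examined.

```python
def percent_abundance(abundance: int, total_sequences: int) -> float:
    """Abundance divided by total sequences examined, expressed as a percentage.

    The factor of 100 is an assumption made for this sketch.
    """
    if total_sequences <= 0:
        raise ValueError("total_sequences must be positive")
    if abundance < 0 or abundance > total_sequences:
        raise ValueError("abundance must lie between 0 and total_sequences")
    return 100.0 * abundance / total_sequences


# Hypothetical library: 3 matching clones among 6000 sequenced clones.
print(percent_abundance(3, 6000))
```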

In this case, electronic Northern analysis of the LIFESEQ™ database (Incyte
Pharmaceuticals, Inc., Palo Alto, Calif.) indicates tissue distribution of the HDAC9 sequence as
seen in Table 5. These results are reported as a list of cDNA libraries in which the transcript
encoding HDAC9 occurs. The presence of HDAC9 in 20 libraries from different tissue-specific
and mixed tissue sources indicates that HDAC9, like other HDAC family members, may be
found as an expressed gene in a wide range of tissues. This result is supported by the Northern
hybridization of an HDAC9 probe to mRNAs from 12 normal tissues (see Figure 7).

Table 5. Tissue distribution determined electronically from LIFESEQ™ database. 

Tissue Category 

Cardiovascular System 

Connective Tissue 

Digestive System 

Embryonic Structures 

Endocrine System 

Exocrine Glands 

Genitalia, Female 

Genitalia, Male 

Germ Cells 

Hemic and Immune System 

Liver 

Musculoskeletal System 

Nervous System 

Pancreas 

Respiratory System 

Sense Organs 

Skin 

Stomatognathic System 

Unclassified/Mixed 

Urinary Tract 

Example 6: Real time PCR survey of HDAC9 distribution in human normal tissues and cell
lines.

Real Time PCR. Total RNA from cultured cell lines was isolated with the RNeasy 96 kit
according to the manufacturer's protocol (Qiagen, Valencia, CA). RNA from human tissues was
purchased (Clontech Inc., Palo Alto, CA) and the tissue sources are listed in Table 6 below.
Table 6. Tissue sources of RNA for real time PCR analysis

Tissue            Sex of donor    Age range of donor (yrs.)    Number of samples pooled
Brain 1           M               [illegible]                  [illegible]
Brain 2           [illegible]     [illegible]                  [illegible]
Cerebellum        M               [illegible]                  [illegible]
Spinal cord       M/F             17-72                        31
Fetal brain       M/F             [illegible] wks              [illegible]
Trachea           M/F             [illegible]                  [illegible]
Liver 1           M               [illegible]                  1
Liver 2           M/F             [illegible]                  9
Fetal liver       ?               15-24 wks                    ?
Stomach           M/F             [illegible]                  [illegible]
Pancreas          M/F             17-69                        18
Colon             [illegible]     [illegible]                  [illegible]
Intestine         M/F             25&30                        2
Kidney            M/F             24-55                        [illegible]
Bone marrow       M/F             18-68                        24
Spleen            M               22-60                        [illegible]
Thymus            M               6-45                         [illegible]
Thyroid           M/F             10-46                        4
Adrenal gland     [illegible]     [illegible]                  [illegible]
Salivary gland    M/F             13-78                        43
Mammary gland     F               23-47                        8
Skeletal muscle   M/F             23-56                        10
Testis            M               28-64                        25
Prostate 1        M               26-64                        23
Prostate 2        M               14-60                        10
Placenta          F               22-41                        15

Numbers following tissues represent separate samples from the same tissue type.
Male (M), Female (F).

Human cell lines (H1299 human lung carcinoma, T24 bladder carcinoma, SJRH30 muscle
rhabdomyosarcoma, SJSA-1 osteosarcoma, human fibroblasts, and A549 human lung carcinoma)
were obtained from the American Type Culture Collection. Total RNA was isolated from
human cell lines using the RNeasy kit according to the manufacturer's instructions (Qiagen,
Valencia, CA). RNAs were quantified using RT-PCR on an ABI Prism Sequence Detection
System. The primers used for detection of HDAC9 were forward primer
5'-GGATCCAGTATCTCTTTGAGGATGAC-3', reverse primer 5'-AGAAGCGCCCATGCTCATA-3', and
Taqman probe 5'-AGCGTCCTTTACTTCTCCTGGCACCG-3'. The Taqman Reaction System (Eurogentec,
Belgium) was used with 10 ng total RNA in a 25 µl reaction in the proportions indicated by the
manufacturer but supplemented with 0.25 U/µl reverse transcriptase (MultiScribe, ABI, Perkin
Elmer, Branchburg, NJ) and 0.08 U/µl RNaseOUT RNase inhibitor (Life Technologies,
Gaithersburg, MD). The reaction was initiated with a 5 min incubation at 48 °C for the reverse
transcription of the mRNA, followed by a 10 min incubation at 95 °C to inactivate the reverse
transcriptase and simultaneously activate the 'hot-start' thermostable DNA polymerase. This was
followed by 50 cycles of a two-step PCR reaction alternating 15 sec at 95 °C and 60 sec at 60 °C.
Computations were performed using ABI sequence detection software (version 1.6.3). The
RT-PCR assays were standardized with cRNAs transcribed in vitro with the T7 RNA polymerase
reaction using the Maxiscript kit (AMBION Inc., Austin, TX) according to the manufacturer's
protocol. The RT-PCR assays were standardized with a dilution series of total RNA isolated
from A549 lung tumor cells. Parallel to the RT-PCR, the total amount of RNA in each reaction
was quantitated in a fluorometric assay using the RiboGreen kit (Molecular Probes Inc.)
according to the manufacturer's instructions, using mammalian ribosomal RNA provided with the
kit as standard.
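
As an illustrative consistency check only, the primer and probe sequences given above can be located on the corresponding stretch of the Fig. 1 nucleotide sequence (the rows labelled 641 to 721, starting at GGCCAGGGGA). The Python sketch below is not part of the described procedure; the helper names reverse_complement and amplicon are hypothetical, and the template is a 90-nucleotide excerpt copied from Fig. 1.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Excerpt of the Fig. 1 sequence spanning the assay region.
TEMPLATE = (
    "GGCCAGGGGATCCAGTATCTCTTTGAGGAT"
    "GACCCCAGCGTCCTTTACTTCTCCTGGCAC"
    "CGCTATGAGCATGGGCGCTTCTGGCCTTTC"
)
FORWARD = "GGATCCAGTATCTCTTTGAGGATGAC"
REVERSE = "AGAAGCGCCCATGCTCATA"
PROBE = "AGCGTCCTTTACTTCTCCTGGCACCG"


def amplicon(template: str, forward: str, reverse: str):
    """Return the region bounded by the two primers, or None if either is absent."""
    start = template.find(forward)
    if start < 0:
        return None
    reverse_site = reverse_complement(reverse)
    end = template.find(reverse_site, start + len(forward))
    if end < 0:
        return None
    return template[start:end + len(reverse_site)]


product = amplicon(TEMPLATE, FORWARD, REVERSE)
print(product is not None and PROBE in product, len(product) if product else 0)
```

The reverse primer anneals to the opposite strand, so it is its reverse complement that has to be found on the sequence as printed; the probe should then fall between the two primer sites.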

Real time PCR was also used to survey the distribution and levels of HDAC9 in tissues
and tumor cell lines, relative to the levels of 18S ribosomal RNA. RNA from the human A549
lung carcinoma cell line was arbitrarily chosen as an internal control for the levels of total RNA
in the samples. The levels of HDAC9 and 18S rRNA in A549 cells were set at 100% and the
levels of HDAC9 and 18S rRNA in other tissues and cell lines were measured as a percent of the
level of these genes in A549 RNA. The levels of 18S ribosomal RNA ranged between 82% and
126% of the A549 internal control in all of the RNA samples, suggesting that there were similar
amounts of RNA in the analyzed tissue samples. HDAC9 was detected at varying levels by real
time PCR in a wide range of tissues (Fig. 8), confirming the Northern blot analysis (Fig. 7). In
normal tissues, HDAC9 was detected at the highest levels in fetal brain (894%), cerebellum
(538%), and thymus (589%). In tumor cell lines, HDAC9 was detected at the highest levels in
SJRH30 cells (850%) (Fig. 8). These results suggest that HDAC9 is differentially expressed in
some tissues at the RNA level.
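
The percent-of-A549 values reported above follow from a simple ratio. A minimal Python sketch is given for illustration; the function names and the example quantities are hypothetical, and the optional adjustment of the HDAC9 value by the 18S value is an assumption rather than a step stated in the text.

```python
def percent_of_reference(sample_level: float, reference_level: float) -> float:
    """Express a measured level as a percent of the A549 reference (set at 100%)."""
    if reference_level <= 0:
        raise ValueError("reference_level must be positive")
    return 100.0 * sample_level / reference_level


def adjusted_for_18s(hdac9_percent: float, rrna_18s_percent: float) -> float:
    """Assumed optional step: scale the HDAC9 percent by the 18S rRNA percent."""
    if rrna_18s_percent <= 0:
        raise ValueError("rrna_18s_percent must be positive")
    return 100.0 * hdac9_percent / rrna_18s_percent


# Hypothetical quantities in arbitrary units from a standard curve.
hdac9 = percent_of_reference(sample_level=4.0, reference_level=2.0)
rrna = percent_of_reference(sample_level=1.1, reference_level=1.0)
print(hdac9, rrna, adjusted_for_18s(hdac9, rrna))
```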

Example 7: HDAC Enzyme Assay

Preparation of HDAC9-flag. A flag epitope tag sequence was added to the 3' end of
HDAC9v1 by PCR. The PCR primers were 5'-ACGCCGGATATCACATTGGTTCTGC-3' and
5'-GCGGAATTCTTATTATTTA^ GTCGACAGCCACCAGGTGAGGATGGCA-3'. The flag-tagged HDAC9v1 was
reconstructed using the EcoRV site in the first primer and subcloned into the XbaI and EcoRI
sites of human expression vector pcDNA3.1(-) (Invitrogen, Carlsbad, CA).

HDAC activity assay. HDAC activity assays are performed as previously described
(Emiliani, S., Fischle, W., Van Lint, C., Al-Abed, Y., and Verdin, E. (1998) Proc. Natl. Acad.
Sci. U.S.A. 95, 2795-2800). 5 x 10^6 293 cells grown to 50% confluency in 100 mm dishes are
transfected with 30 µg of C-terminally flag-tagged HDAC1, HDAC3, HDAC4, HDAC6,
HDAC7, or HDAC9 using the Geneporter transfection kit according to the manufacturer's
instructions. The cell culture medium is changed 5 h after transfection. 48 h after transfection,
cells are washed in cold PBS and scraped into 1 ml of IP buffer (50 mM Tris-HCl pH 7.5,
120 mM NaCl, 0.5 mM EDTA, 0.5% NP-40) and incubated on a rocker for 20 min. Cellular
debris is pelleted in a centrifuge at 14K for 20 min. The supernatant is precleared for 1 h with
protein G beads (Pharmacia Biotech) in IP buffer. Immunoprecipitations are performed by
incubating the precleared supernatant with either α-FLAG M2 agarose affinity gel (Sigma) for 2
h at 4°C or anti-HDAC2 (Santa Cruz) for 1 h followed by incubation with protein G beads for 1
h at 4°C. The beads are then washed three times for 5 min in IP buffer and then washed three
times in high salt IP buffer (50 mM Tris-HCl pH 7.5, 1000 mM NaCl, 0.5 mM EDTA, 0.5%
NP-40) at 4°C. IPs are then washed two times for 2 min in 1 ml of HD-buffer (10 mM Tris-HCl
pH 8.0, 10 mM NaCl, 10% glycerol). When trapoxin (TPX) inhibition is determined, IPs are
incubated with 0.3, 3, 30 and 300 nM TPX in HD-buffer for 20 min. Supernatants are incubated
with 100000 cpm substrate ([3H]Ac(H4 1-24) SGRGKGGKGLGKGGAKRHRKVLRD, in vitro/chemically
acetylated using BOP-chemistry) in 30 µl HD-buffer or TPX in HD-buffer, resuspending the
sepharose by gently tapping the tube and shaking in an Eppendorf 5436 Thermomixer at full
speed at 37°C for 2 h. 170 µl HD-buffer and 50 µl stop-mix (1 M HCl, 0.16 M HAc) are added
and vortexed for 15 min; 600 µl ethyl acetate is then added and vortexed for 45 minutes, then
centrifuged at 14000g for 7 minutes. 540 µl of the organic (upper) phase is then counted in 5 ml
scintillation liquid using conventional techniques.
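
For illustration, the volumes of stock solutions needed to make up the IP buffer described above (50 mM Tris-HCl pH 7.5, 120 mM NaCl, 0.5 mM EDTA, 0.5% NP-40) can be computed with the dilution relation C1 x V1 = C2 x V2. In the Python sketch below, the stock concentrations (1 M Tris-HCl, 5 M NaCl, 0.5 M EDTA, 10% NP-40) and the function names are assumptions chosen for the example and are not specified in the text.

```python
def stock_volume_ml(final_conc: float, stock_conc: float, final_volume_ml: float) -> float:
    """Volume of stock needed, from C1 x V1 = C2 x V2 (same units for both concentrations)."""
    if stock_conc <= 0 or final_conc > stock_conc:
        raise ValueError("stock must be positive and at least as concentrated as the target")
    return final_conc * final_volume_ml / stock_conc


# component: (final concentration, ASSUMED stock concentration), same units per row
IP_BUFFER = {
    "Tris-HCl pH 7.5 (mM)": (50.0, 1000.0),
    "NaCl (mM)": (120.0, 5000.0),
    "EDTA (mM)": (0.5, 500.0),
    "NP-40 (%)": (0.5, 10.0),
}


def recipe(buffer: dict, final_volume_ml: float) -> dict:
    """Return the stock volume for each component plus water to reach the final volume."""
    volumes = {
        name: stock_volume_ml(final, stock, final_volume_ml)
        for name, (final, stock) in buffer.items()
    }
    volumes["water"] = final_volume_ml - sum(volumes.values())
    return volumes


for component, volume in recipe(IP_BUFFER, 100.0).items():
    print(f"{component}: {volume:.2f} ml")
```

The high salt IP buffer differs only in its NaCl concentration, so the same sketch applies with the NaCl entry changed to 1000 mM.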
HDAC9 is catalytically active. In vitro histone deacetylase assays using
immunoprecipitated HDAC9 and a 3H-acetylated histone H4 peptide as substrate were
performed to determine whether HDAC9 was catalytically active and to compare the activity of
HDAC9 to known catalytically active HDAC1, HDAC3, and HDAC4. An HDAC-related protein
that lacks catalytic activity, HDRP/MITR/HDACC, was used as a negative control (Zhou, X.,
Richon, V.M., Rifkind, R.A., Marks, P.A. (2000) Identification of a transcriptional repressor
related to the noncatalytic domain of histone deacetylases 4 and 5. Proc. Natl. Acad. Sci. USA 97,
1056-61). These results demonstrated that HDAC9 could deacetylate the histone peptide
substrate at a level that was equivalent to HDAC3 and HDAC4 (Fig. 12A), while HDAC1 was
more effective in this assay (Fig. 12B).

Example 8: HDAC9 expression and cellular localization

HDAC9 is expressed in vitro using 1 µg of the M6 clone, 2 µl of 35S-methionine and the Sp6
TNT Quick Coupled Transcription/Translation System according to the manufacturer's
instructions (Promega, Madison, WI). Proteins are electrophoresed on an SDS-PAGE gel
according to conventional methods and visualized by a Storm phosphorimager. The molecular
weight of the complete HDAC9 sequence is estimated in silico as 72 kDa using VectorNTI Suite
software (Informax, North Bethesda, MD). A doublet was observed on a 10% SDS-PAGE gel.
Doublets have also been observed when HDAC1 is translated in vitro. These doublets suggest
that there is potentially a second translation initiation site. Furthermore, these results suggest
that HDAC9 is an expressed gene. See Figure 13.
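
As a rough plausibility check on the in silico estimate above, the mass of a protein can be approximated from its length. The Python sketch below is illustrative only: it uses the common rule of thumb of about 110 Da per amino acid residue, which is an assumption and not the method used by the VectorNTI software, and the function name is hypothetical.

```python
AVERAGE_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average, an assumption for this sketch


def approx_mass_kda(n_residues: int) -> float:
    """Approximate protein mass in kDa from the number of residues."""
    if n_residues <= 0:
        raise ValueError("n_residues must be positive")
    return n_residues * AVERAGE_RESIDUE_MASS_DA / 1000.0


# The HDAC9 peptide is described as 673 amino acids long.
print(f"{approx_mass_kda(673):.1f} kDa")
```

An exact value depends on the amino acid composition, so a composition-based calculation is expected to differ somewhat from this length-based approximation.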

1 x 10^5 Cos7 cells are plated onto chamber slides. Cells are transfected on the slides with
2 µg of flag epitope-tagged HDAC9 or a cytoplasmically expressed protein (Ena-flag) using
Geneporter2 in serum free medium according to the manufacturer's instructions. The cell culture
medium is changed 24 h after transfection. 48 h after transfection, cells are washed three times
with PBS, fixed for 15 min in 5% formaldehyde, washed two times in PBS, and blocked for 30
minutes at room temperature in 10% fetal calf serum (Sigma) in PBS with 0.5% Triton-X-100 to
permeabilize the cells. The cells are washed again two times in PBS and then incubated with 25
mg/ml anti-Flag-FITC conjugate for 1 hour. The stained cells are washed with PBS and
photographed using fluorescence microscopy.

HDAC9 is a nuclear protein. The translated HDAC9 peptide sequence predicts a 72
kDa protein and this was confirmed by in vitro translation (Fig. 13A). In order to determine the
cellular localization of HDAC9, flag epitope-tagged HDAC9, Enabled (Ena) or pCMV 4flag
were transfected into Cos7 and 293 cells or cells were mock transfected without plasmid. The
flag epitope was detected by fluorescence immunocytochemistry 48 h after transfection (Fig.
13B). Ena is a cytoskeleton-associated cytoplasmic protein substrate of Abl tyrosine kinase that
transduces the axon-repulsive function of the Roundabout receptor during axon guidance
(Gertler FB, Comer AR, Juang JL, Ahern SM, Clark MJ, Liebl EC, Hoffmann FM. (1995)
enabled, a dosage-sensitive suppressor of mutations in the Drosophila Abl tyrosine kinase,
encodes an Abl substrate with SH3 domain-binding properties. Genes Dev. 9, 521-533; Bashaw
GJ, Kidd T, Murray D, Pawson T, Goodman CS. (2000) Repulsive axon guidance: Abelson and
Enabled play opposing roles downstream of the roundabout receptor. Cell 101, 703-715). As
expected, Ena was detected in the cytoplasm, whereas HDAC9 was detected in the nuclei of
these cells. The detection of HDAC9 in the nuclei of both Cos7 and 293 cells suggested that
HDAC9 was predominantly a nuclear protein.

Example 9: Identification of associated proteins in HDAC complexes

Transfection. 1 x 10^7 Cos7 cells are transfected with 10 µg of either C-terminally flag
epitope-tagged HDAC1, HDAC2, HDAC3, HDAC4, HDAC6, HDAC7, or HDAC9 in
pcDNA3.1 expression vector or Flag vector or buffer (Mock) as transfection controls, by
electroporation using a Gene Pulser II instrument (Bio-Rad, Hercules, CA) set at 0.3 kV/500 µF.
Immunoprecipitation. Immunoprecipitations are performed as described (Grozinger, C.
M., Hassig, C. A., and Schreiber, S. L. 1999. Proc. Natl. Acad. Sci. USA 96, 4868-4873). Whole
cell extracts are prepared 48 h after transfection by scraping cells into JLB buffer (50 mM
Tris-HCl, pH 8, 150 mM NaCl, 10% glycerol, 0.5% Triton-X-100) containing complete protease
inhibitor cocktail (Boehringer-Mannheim). Lysis is continued at 4°C for 10 min and then
cellular debris is pelleted by centrifugation at 14K for 5 minutes. Supernatants are pre-cleared
with Sepharose A/G-plus agarose beads (Santa Cruz). Recombinant proteins are
immunoprecipitated from pre-cleared supernatant by incubation with α-FLAG M2 agarose
affinity gel (Sigma) for 2 h at 4°C or anti-HDAC1 (Santa Cruz, Santa Cruz, CA) for 1 h at 4°C,
followed by incubation with Sepharose A/G beads. For Western blot analysis, the beads are
washed with MSWB buffer (50 mM Tris-HCl, pH 8, 150 mM NaCl, 1 mM EDTA, 0.1% NP-40)
and the proteins are separated by SDS/PAGE. Western blots are probed with anti-flag M2
(Sigma), anti-HDAC1 (Santa Cruz), anti-HDAC2 (Santa Cruz), anti-HDAC6 (Santa Cruz), anti-Rb
(Pharmingen), or anti-mSin3A (Transduction Labs, Lexington, KY).
HDAC9 associates with proteins in the mSin3A complex. Class I HDACs, but not
class II HDACs, were previously found to be associated with the mSin3A complexes. The core
HDAC1 complex consists of HDAC1, HDAC2, RbAp46, and RbAp48. This core complex has been
found to associate with an mSin3A complex that is involved in transcriptional repression through
an Rb and E2F complex (Luo RX, Postigo AA, Dean DC. (1998) Rb interacts with histone
deacetylase to repress transcription. Cell 92, 463-473; Magnaghi-Jaulin L, Groisman R,
Naguibneva I, Robin P, Lorain S, Le Villain JP, Troalen F, Trouche D, Harel-Bellan A. (1998)
Retinoblastoma protein represses transcription by recruiting a histone deacetylase. Nature 391,
601-605; Brehm A, Miska EA, McCance DJ, Reid JL, Bannister AJ, Kouzarides T. (1998)
Retinoblastoma protein recruits histone deacetylase to repress transcription. Nature 391, 597-
601). In order to determine whether HDAC9 was a part of this complex, endogenous HDAC1,
HDAC2, Rb, and mSin3 proteins were co-immunoprecipitated from cells transfected with
flag-epitope tagged HDAC1, HDAC3, HDAC4, HDAC6, HDAC7 or HDAC9. To assure that
transfected flag epitope-tagged HDACs could be detected in cells, the levels of HDAC
expression were detected by immunoprecipitation and Western blotting with antiserum to the
flag epitope. To determine which HDACs associated with components of the Sin3 complex,
endogenous proteins in the Sin3 complex were immunoprecipitated and the associated HDACs
were detected by Western blotting with flag epitope-specific antibody. HDAC9 was found to
associate with HDAC1, HDAC2, Rb, and mSin3A, suggesting that HDAC9 is a component of an
mSin3A complex.

HDAC9 associates with SMRT and NCoR. Since corepressors SMRT and NCoR
associate with the mSin3 core complex, experiments were performed to co-immunoprecipitate
HDACs with NCoR and SMRT (Fig. 15). HDAC9 co-immunoprecipitated with both of these
proteins, suggesting that HDAC9 associates with SMRT and NCoR. Western analysis of the
flag-detected blots with anti-NCoR indicated that NCoR was immunoprecipitated. As previously
reported, SMRT co-immunoprecipitated with HDAC4 and HDAC6, and HDAC6 and HDAC7
did not associate with the Sin3A complex.

HDAC9 associates with 14-3-3 and Erk proteins. HDAC4 was previously found to
associate with 14-3-3-β, 14-3-3-ε, CamK, Erk1, and Erk2 proteins, which sequester HDAC4 in
the cytoplasm and prevent phosphorylated HDAC4 and HDAC5 from entering the nucleus and
repressing MEF2 activated transcription. In order to determine whether HDAC9 associates with
these proteins, experiments were performed to co-immunoprecipitate HDACs with 14-3-3 and
Erk proteins. All of the HDACs tested associated with 14-3-3s and Erks. These results suggest
that the association of HDACs with 14-3-3 and Erks might be a general mechanism of
sequestering HDACs in the cytoplasm.

Classification of HDAC9. HDACs have been classified by sequence similarity to yeast
HDACs, sequence length, location of catalytic domain, cellular localization, associating proteins,
and sensitivity to HDAC inhibitors. The data in this study suggest that HDAC9 has
characteristics of both class I and class II HDACs. HDAC9 had sequence similarity with class II
yeast hda1 subfamily member Clr3 and HDAC6 catalytic domain 1. In addition, the 3 Kb
HDAC9 transcript was only detected in kidney and testis, suggesting that it might have a limited
tissue distribution like class II HDACs. HDAC9 was between class I and class II HDACs in
length. Class I HDACs average 443 amino acids in length, whereas class II HDACs average 1069
amino acids in length. However, HDAC9 was found to have an N-terminal catalytic domain, as
opposed to the C-terminal domains that have been found in class II HDACs. HDAC6 is an
exception that has both N-terminal and C-terminal catalytic domains. Furthermore, class I HDACs
are nuclear proteins, while class II HDACs are nucleo-cytoplasmic. Immunocytochemistry
indicated that HDAC9 was predominantly nuclear and was detected in a different subcellular
compartment in comparison to the Ena protein that is expressed in the cytoplasm. In contrast to
the 3 Kb HDAC9 transcript that might be differentially expressed, a 3.5 Kb HDAC9 transcript
that was identified by Northern analysis was expressed ubiquitously in normal tissues, tumor
tissues and cell lines, similar to class I HDACs. In addition, HDAC9 was found to
co-immunoprecipitate with proteins that were previously only associated with class I HDAC
complexes, including HDAC1, HDAC2, mSin3A, and Rb. HDAC9 also has putative C-terminal
LXCXE motifs that so far have only been found in HDAC1. HDAC9 was also found to associate
with NCoR and SMRT. This evidence suggests HDAC9 had characteristics that bridged those of
class I and class II HDACs.
What is claimed is:

1. An isolated polypeptide comprising the amino acid sequence set forth in
SEQ ID NO:1, SEQ ID NO:5 or SEQ ID NO:6.

2. An isolated polypeptide consisting of the amino acid sequence set forth in
SEQ ID NO:1, SEQ ID NO:5 or SEQ ID NO:6.

3. An isolated DNA comprising a nucleic acid sequence that encodes the
polypeptide of claim 1 or 2.

4. A vector molecule comprising at least a fragment of the isolated DNA
according to claim 3.

5. The vector molecule according to claim 4 comprising transcriptional control
sequences.

6. A host cell comprising the vector molecule according to claim 5.

7. The isolated DNA according to claim 3, comprising a nucleotide sequence
selected from the group consisting of (1) the nucleotide sequence set forth in SEQ ID NO:2, 7 or
8, being the complete cDNA sequence encoding the polypeptide as defined in claim 2; (2) the
nucleotide sequence set forth in SEQ ID NO:3, being the open reading frame of the cDNA
sequence encoding the polypeptide as defined in claim 2; (3) a nucleotide sequence capable of
hybridizing under high stringency conditions to a nucleotide sequence set forth in SEQ ID NO:3;
and (4) the nucleotide sequence set forth in SEQ ID NO:4, being the endogenous genomic
human DNA encoding the polypeptide as defined in claim 2.
8. A vector molecule comprising at least a fragment of an isolated DNA
molecule according to claim 7.

9. The vector molecule according to claim 8 comprising transcriptional
control sequences.

10. A host cell comprising the vector molecule according to claim 9.

11. A host cell which can be propagated in vitro and which is capable upon
growth in culture of producing a polypeptide according to claim 1 or 2, wherein said cell
comprises at least one transcriptional control sequence that is not a transcriptional control
sequence of the natural endogenous human gene encoding the polypeptide of claim 2, wherein
said one or more transcriptional control sequences control transcription of a DNA encoding a
polypeptide according to claim 1 or 2.



12. A method for the diagnosis of a condition associated with abnormal
regulation of gene expression which includes abnormal cell proliferation, cancer,
atherosclerosis, inflammatory bowel disease, host inflammatory or immune response, or
psoriasis in a human which comprises: detecting abnormal transcription of messenger RNA
transcribed from the natural endogenous human gene encoding the polypeptide as defined in
claim 2 in an appropriate tissue or cell from a human, wherein said abnormal transcription is
diagnostic of said condition.

13. The method of claim 12, wherein said natural endogenous human gene
comprises the nucleotide sequence set forth in SEQ ID NO:4, 7 or 8.

14. The method of claim 12, comprising contacting a sample of said
appropriate tissue or cell or contacting an isolated RNA or DNA molecule derived from said
tissue or cell with an isolated nucleotide sequence of at least about 15-20 nucleotides in length
that hybridizes under high stringency conditions with the isolated nucleotide sequence as defined
in claim 3.

15. A method for the diagnosis of a condition associated with abnormal
HDAC9 expression or activity in a human which comprises:

measuring the amount of a polypeptide comprising the amino acid
sequence set forth in SEQ ID NO:1, 5 or 6 or fragments thereof, in an appropriate tissue or cell
from a human suffering from said condition, wherein the presence of an abnormal amount of said
polypeptide or fragments thereof, relative to the amount of said polypeptide or fragments thereof
in the respective tissue from a human not suffering from said condition associated with abnormal
HDAC9 expression or activity, is diagnostic of said human's suffering from a condition.

16. The method of claim 15, wherein said detecting step comprises contacting
said appropriate tissue or cell with an antibody which specifically binds to a polypeptide that
comprises the amino acid sequence set forth in SEQ ID NO:1, 5 or 6 or a fragment thereof and
detecting specific binding of said antibody with a polypeptide in said appropriate tissue or cell,
wherein detection of specific binding to a polypeptide indicates the presence of a polypeptide
that comprises the amino acid sequence set forth in SEQ ID NO:1, 5 or 6 or a fragment thereof.

17. An antibody or a fragment thereof which specifically binds to a
polypeptide that comprises the amino acid sequence set forth in SEQ ID NO:1, 5 or 6 or to a
fragment of said polypeptides.

18. An antibody fragment according to claim 17 which is an Fab or F(ab')2
fragment.

19. An antibody according to claim 17 which is a polyclonal antibody.

20. An antibody according to claim 17 which is a monoclonal antibody.

21. A method for producing a polypeptide as defined in claim 1 or 2, which
method comprises:

culturing a host cell having incorporated therein an expression vector comprising
an exogenously-derived polynucleotide encoding a polypeptide comprising an amino acid
sequence as set forth in SEQ ID NO:1, 5 or 6 under conditions sufficient for expression of the
polypeptide in the host cell, thereby causing the production of the expressed polypeptide.

22. The method according to claim 21, said method further comprising recovering
the polypeptide produced by said cell.

23. The method according to claim 21, wherein said exogenously-derived
polynucleotide encodes a polypeptide consisting of an amino acid sequence set forth in
SEQ ID NO:1, 5 or 6.

24. The method according to claim 21, wherein said exogenously-derived
polynucleotide comprises the nucleotide sequence as set forth in SEQ ID NO:2, 7 or 8.

25. The method according to claim 21, wherein said exogenously-derived
polynucleotide comprises the nucleotide sequence as set forth in SEQ ID NO:3.

26. The method according to claim 21, wherein said exogenously-derived
polynucleotide consists of the nucleotide sequence as set forth in SEQ ID NO:3.

27. The method according to claim 24, wherein said exogenously-derived
polynucleotide comprises the nucleotide sequence as set forth in SEQ ID
NO:4.
Fig. 1

1 GGCGCCGAGG CTTCTGCGTC CGTCGTGGTT CCTCGCTCCG 
41 GGGCGGAGTT CGCGATAGCG ATCGGGGAGC AGGACGCGGG 
81 GCGTGGACCC AGGTCCGAGG CGAGGAAGCC GTAACCCATG 
121 CGCGGGGAGC CTCCCCCTTC GACTGCAGCC TCGCTCCGTG 
161 CCTTCTGCGC GCCTGGGATC CCGGAGCCTG CCTAGGTTCT 
201 GTGCGCTCCC GCCCAGGCCG GTGCCCGCCG CCCGCCTGCG 
241 CCCCAGGCAG GTCCCAGGCC TCCGGCTGCT CCCGGCCGAA 
281 GCCCCGAGTG CGAGATCGAG CGTCCTGAGC GCCTGACCGC 
321 AGCCCTGGAT CGCCTGCGGC AGCGCGGCCT GGAACAGAGG 
361 TGTCTGCGGT TGTCAGCCCG CGAGGCCTCG GAAGAGGAGC
401 TGGGCCTGGT GCACAGAGTA CCTTTCACTG CGCGCGGCTG
441 GCCGCAGGGG CTGGACTGCA GCTGGTGGAC GCTGTGCTCA
481 CTGGAGCTGT GCAAAATGGG CTTGCCCTGG TGAGGCCTCC
521 CGGGCACCAT GGCCAGAGGG CGGCTGCCAA CGGGTTCTGT
561 GTGTTCAACA ACGTGGCCAT AGCAGCTGCA CATGCCAAGC
601 AGAAACACGG GCTACACAGG ATCCTCGTCG TGGACTGGGA 
641 TGTGCACCAT GGCCAGGGGA TCCAGTATCT CTTTGAGGAT
681 GACCCCAGCG TCCTTTACTT CTCCTGGCAC CGCTATGAGC 
721 ATGGGCGCTT CTGGCCTTTC CTGCGAGAGT CAGATGCAGA 
761 CGCAGTGGGG CGGGGACAGG GCCTCGGCTT CACTGTCAAC 
801 CTGCCCTGGA ACCAGGTTGG GATGGGAAAC GCTGACTACG 
841 TGGCTGCCTT CCTGCACCTG CTGCTCCCAC TGGCCTTTGA 
881 GTTTGACCCT GAGCTGGTGC TGGTCTCGGC AGGATTTGAC 
921 TCAGCCATCG GGGACCCTGA GGGGCAAATG CAGGCCACGC 
961 CAGAGTGCTT CGCCCACCTC ACACAGCTGC TGCAGGTGCT 
1001 GGCCGGCGGC CGGGTCTGTG CCGTGCTGGA GGGCGGCTAC 
1041 CACCTGGAGT CACTGGCGGA GTCAGTGTGC ATGACAGTAC 
1081 AGACGCTGCT GGGTGACCCG GCCCCACCCC TGTCAGGGCC
1121 AATGGCGCC 
Fig. 2



1 ATGGGGACCG CGCTTGTGTA CCATGAGGAC ATGACGGCCA CCCGGCTGCT 
51 CTGGGACGAC CCCGAGTGCG AGATCGAGCG TCCTGAGCGC CTGACCGCAG 
101 CCCTGGATCG CCTGCGGCAG CGCGGCCTGG AACAGAGGTG TCTGCGGTTG 
151 TCAGCCCGCG AGGCCTCGGA AGAGGAGCTG GGCCTGGTGC ACAGCCCAGA 
201 GTATGTATCC CTGGTCAGGG AGACCCAGGT CCTAGGCAAG GAGGAGCTGC 
251 AGGCGCTGTC CGGACAGTTC GACGCCATCT ACTTCCACCC GAGTACCTTT 
301 CACTGCGCGC GGCTGGCCGC AGGGGCTGGA CTGCAGCTGG TGGACGCTGT 
351 GCTCACTGGA GCTGTGCAAA ATGGGCTTGC CCTGGTGAGG CCTCCCGGGC 
401 ACCATGGCCA GAGGGCGGCT GCCAACGGGT TCTGTGTGTT CAACAACGTG 
451 GCCATAGCAG CTGCACATGC CAAGCAGAAA CACGGGCTAC ACAGGATCCT 
501 CGTCGTGGAC TGGGATGTGC ACCATGGCCA GGGGATCCAG TATCTCTTTG 
1001 AGGATGACCC CAGCGTCCTT TACTTCTCCT GGCACCGCTA TGAGCATGGG 
1051 CGCTTCTGGC CTTTCCTGCG AGAGTCAGAT GCAGACGCAG TGGGGCGGGG 
1101 ACAGGGCCTC GGCTTCACTG TCAACCTGCC CTGGAACCAG GTTGGGATGG 
1151 GAAACGCTGA CTACGTGGCT GCCTTCCTGC ACCTGCTGCT CCCACTGGCC 
1201 TTTGAGTTTG ACCCTGAGCT GGTGCTGGTC TCGGCAGGAT TTGACTCAGC 
1251 CATCGGGGAC CCTGAGGGGC AAATGCAGGC CACGCCAGAG TGCTTCGCCC 
1301 ACCTCACACA GCTGCTGCAG GTGCTGGCCG GCGGCCGGGT CTGTGCCGTG 
1351 CTGGAGGGCG GCTACCACCT GGAGTCACTG GCGGAGTCAG TGTGCATGAC 
1401 AGTACAGACG CTGCTGGGTG ACCCGGCCCC ACCCCTGTCA GGGCCAATGG 
1451 CGCCATGTCA GAGGTGCGAG GGGAGTGCCC TAGAGTCCAT CCAGAGTGCC 
1501 CGTGCTGCCC AGGCCCCGCA CTGGAAGAGC CTCCAGCAGC AAGATGTGAC 
1551 CGCTGTGCCG ATGAGCCCCA GCAGCCACTC CCCAGAGGGG AGGCCTCCAC 
1601 CTCTGCTGCC TGGGGGTCCA GTGTGTAAGG CAGCTGCATC TGCACCGAGC 
1651 TCCCTCCTGG ACCAGCCGTG CCTCTGCCCC GCACCCTCTG TCCGCACCGC 
1701 TGTTGCCCTG ACAACGCCGG ATATCACATT GGTTCTGCCC CCTGACGTCA 
1751 TCCAACAGGA AGCGTCAGCC CTGAGGGAGG AGACAGAAGC CTGGGCCAGG 
1801 CCACACGAGT CCCTGGCCCG GGAGGAGGCC CTCACTGCAC TTGGGAAGCT 
1851 CCTGTACCTC TTAGATGGGA TGCTGGATGG GCAGGTGAAC AGTGGTATAG 
1901 CAGCCACTCC AGCCTCTGCT GCAGCAGCCA CCCTGGATGT GGCTGTTCGG 
2001 AGAGGCCTGT CCCACGGAGC CCAGAGGCTG CTGTGCGTGG CCCTGGGACA 
2051 GCTGGACCGG CCTCCAGACC TCGCCCATGA CGGGAGGAGT CTGTGGCTGA 
2101 ACATCAGGGG CAAGGAGGCG GCTGCCCTAT CCATGTTCCA TGTCTCCACG 
2151 CCACTGCCAG TGATGACCGG TGGTTTCCTG AGCTGCATCT TGGGCTTGGT 

2201 GCTGCCCCTG GCCTATGGCT TCCAGCCTGA CCTGGTGCTG GTGGCGCTGG 
2251 GGCCTGGCCA TGGCCTGCAG GGCCCCCACG CTGCACTCCT GGCTGCAATG 

2301 CTTCGGGGGC TGGCAGGGGG CCGAGTCCTG GCCCTCCTGG AGGAGAACTC 
2351 CACACCCCAG CTAGCAGGGA TCCTGGCCCG GGTGCTGAAT GGAGAGGCAC 
2401 CTCCTAGCCT AGGCCCTTCC TCTGTGGCCT CCCCAGAGGA CGTCCAGGCC 
2451 CTGATGTACC TGAGAGGGCA GCTGGAGCCT CAGTGGAAGA TGTTGCAGTG 
2501 CCATCCTCAC CTGGTGGCTT GA 



MGTALVYHED MTATRLLWDD PECEIERPER LTAALDRLRQ RGLEQRCLRL SAREASEEEL 
GLVHSPEYVS LVRETQVLGK EELQALSGQF DAIYFHPSTF HCARLAAGAG LQLVDAVLTG 
AVQNGLALVR PPGHHGQRAA ANGFCVFNNV AIAAAHAKQK HGLHRILVVD WDVHHGQGIQ 
YLFEDDPSVL YFSWHRYEHG RFWPFLRESD ADAVGRGQGL GFTVNLPWNQ VGMGNADYVA 
AFLHLLLPLA FEFDPELVLV SAGFDSAIGD PEGQMQATPE CFAHLTQLLQ VLAGGRVCAV 
LEGGYHLESL AESVCMTVQT LLGDPAPPLS GPMAPCQRCE GSALESIQSA RAAQAPHWKS 
LQQQDVTAVP MSPSSHSPEG RPPPLLPGGP VCKAAASAPS SLLDQPCLCP APSVRTAVAL 
TTPDITLVLP PDVIQQEASA LREETEAWAR PHESLAREEA LTALGKLLYL LDGMLDGQVN 
SGIAATPASA AAATLDVAVR RGLSHGAQRL LCVALGQLDR PPDLAHDGRS LWLNIRGKEA 
AALSMFHVST PLPVMTGGFL SCILGLVLPL AYGFQPDLVL VALGPGHGLQ GPHAALLAAM 
LRGLAGGRVL ALLEENSTPQ LAGILARVLN GEAPPSLGPS SVASPEDVQA LMYLRGQLEP 
QWKMLQCHPH LVA 
Fig. 3 
tccttgcccctgatgttcagccacagactcctccttcc , 

II Ml II I MM II III I III II Mill II II l««< 

tccttgcccctgatgttcagccacagactcctc 



302 



.cctaccc 
««<| | 
cc 



gtcatgggcgaggtctggaggccggtccagctgtcccagggccacgcaca 

||||||||||||||||||||||||||||||||||||||||||||||||||

gtcatgggcgaggtctggaggccggtccagctgtcccagggccacgcaca 



1225 gcagcctgga cttacctctgggctccgtgggacaggcctctccga 

lllll««< 139 <««IM IIIIIM MM MM Mill IIIIIM 

492 gcagc ctctgggctccgtgggacaggcctctccga 



1399 acagccacatccagggtggctgctgcagcagaggctggagtggctgctat 

     ||||||||||||||||||||||||||||||||||||||||||||||||||

527 acagccacatccagggtggctgctgcagcagaggctggagtggctgctat 
HDAC9 
AL022328 
HDAC9 
AL022328 
AL022328 
HDAC9 
AL022328 
HDAC9 
AL022328 
HDAC9 
AL022328 
HDAC9 
AL022328 
HDAC9 
AL022328 
HDAC9 



1449 accactgttcacctgtg cccacctgcccatccagcatcccatcta 

||||||||1MI<«« 725 ««<||||||||||||||||||||lll 
577 accactgttcac ctgcccatccagcatcccatcta 

2209 agaggtacaggagcttcccaagtgcagtgagggcctcctcccgggccagg 

     ||||||||||||||||||||||||||||||||||||||||||||||||||

612 agaggtacaggagcttcccaagtgcagtgagggcctcctcccgggccagg 
2259 gactcgtgtggcctgtg cccacctggcccaggcttctgtctcctc 

llllllllllll««< 113 ««<|||||| Mill MMII MINI 

662 gactcgtgtggc ctggcccaggcttctgtctcctc 



2407 



697 



cctcagggctgacgcttcctgttggatgacgtcagggggcagaaccaatg 

||||||||||||||||||||||||||||||||||||||||||||||||||

cctcagggctgacgcttcctgttggatgacgtcagggggcagaaccaatg 



2457 tgatatccggcgttgtcagggcaacagcggtgcggacagagggtgcgggg 

     ||||||||||||||||||||||||||||||||||||||||||||||||||

747 tgatatccggcgttgtcagggcaacagcggtgcggacagagggtgcgggg 
2507 cagaggcacggctggtccaggagggagctcggtgcagatgcagctgcctt 

     ||||||||||||||||||||||||||||||||||||||||||||||||||

797 cagaggcacggctggtccaggagggagctcggtgcagatgcagctgcctt 
2557 acacactggacccccaggcagcagaggtggaggcctcccctctggggagt 

     ||||||||||||||||||||||||||||||||||||||||||||||||||

847 acacactggacccccaggcagcagaggtggaggcctcccctctggggagt 



2607 ggctgctggggctcatcggcacagcggtcacatctagg . 

iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiui<«« 

897 ggctgctggggctcatcggcacagcggtcacat 



. ctgacctt 

II 

.ctt 



79 <<«< | 



2722 



933 



gctgctggaggctcttccagtgcggggcctgggcagcacgggcactctgg 

I M II II II 1 1 MM 1 1 Mill Mill II M M MM I MM II IM M 

gctgctggaggctcttccagtgcggggcctgggcagcacgggcactctgg 



2772 atggactctagggcactgtg. - - .cctacctcccctcgcacctctgacat 

iiiiiiiiiiiiiii««< 6i ««<iii inn mill mi in 

983 atggactctagggca ctcccctcgcacctctgacat 



2869 



1019 



ggcgccattggccctgacaggggtggggccgggtcacccagcagcgtctg 

||||||||||||||||||||||||||||||||||||||||||||||||||

ggcgccattggccctgacaggggtggggccgggtcacccagcagcgtctg 



2919 tactgtcatgcacactgactccgccagtgactccaggtggtagccgccct 

Mill II Mill Mil III II MMI II III II II 1 1 1 III 1 1 MM l« 

1069 tactgtcatgcacactgactccgccagtgactccaggtggtagccgcc. . 
2966 ggg gtcacctccagcacggcacagacccggccgccggccagcacc 

«< 189 ««<iMiiiiiiiiiii Milium i inn mini 

1116 ctccagcacggcacagacccggccgccggccagcacc 
AL022328 
AL022328 
AL022328 
HDAC9 
AL022328 
AL022328 
HDAC9 
AL022328 
HDAC9 
AL022328 
HDAC9 
AL022328 
HDAC9 
AL022328 
HDAC9 
AL022328 
HDAC9 



3193 tacagcagctgtgtgaggtgggcgaagcactctggcgtggcctgcatttg 

illiiiiiiiiiiiiiiiiiiiiiiiiiiiiniiiiiiiiiiiiiiiii 

1154 tgcagcagctgtgtgaggtgggcgaagcactctggcgtggcctgcatttg 
3243 cccctgga ctcacctcagggtccccgatggctgagtcaaatcctgc 

|||««< 79 ««<| MUM Mill INI I Mil Mill Mill II 

1204 ccc .ctcagggtccccgatggctgagtcaaatcctgc 



3358 cgagaccagcaccagctcagggtcaaactaca . 

iiiiiiiiiiiiiiiiiiiiiiiiiii««< 

1240 cgagaccagcaccagctcagggtcaaa 



. . . . .gtcacctcaaagg 

212 ««<|||MIII 
ctcaaagg 



3605 ccagtgggagcagcaggtgcaggaaggcagccacgtagtcagcgtttccc 

     ||||||||||||||||||||||||||||||||||||||||||||||||||

1275 ccagtgggagcagcaggtgcaggaaggcagccacgtagtcagcgtttccc 
3655 atcccaacctggc ggcacctggttccagggcaggttgacagtgaa 

||||||||««< 159 <<<<<||||||||| HUM MINIMUM 

1325 atcccaac ctggttccagggcaggttgacagtgaa 

3849 accQaQciccctQtccccgccccactgcgtctgcatctgactctcgcagga 

Tmiiiiimiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiimii 

1360 gccgaggccctgtccccgccccactgcgtctgcatctgactctcgcagga 
3899 aaggccagaagcgcccatgctcatagcggtgccaggagaagt aaaggacg 

MIIIIIIMIIIIIIIIIIIIIIIIMMMMIMIIIIIIIIIIMI 

1410 aaggccagaagcgcccatgctcatagcggtgccaggagaagtaaaggacg 



3948 ctgcc, 



<<<<< 



180 



1459 



.ctcacctggggtcatcctcaaagagatactggatcccctg 

««<IIIIIIIIIIIIIIIIIIIIIIIIIIIIMIIIM 

ctggggtcatcctcaaagagatactggatcccctg 



4164 gccatggtgcacatcccagtccacgacgaggatcctggg 

||IIIIIIIIIIMIIIIIIIIIIIIIIIIIIII««< 156 

1495 gccatggtgcacatcccagtccacgacgaggatc 



cacacc 
<<<<< | 
c 



4355 tgtgtagcccgtgtttctgcttggcatgtgcagctgctatggccacgttg 

     ||||||||||||||||||||||||||||||||||||||||||||||||||

1530 tgtgtagcccgtgtttctgcttggcatgtgcagctgctatggccacgttg 



4405 



1580 



4455 



1630 



ttciaacacacagaacccgttggcagccgccctctggccatggtgcccggg 

iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiii 

ttgaacacacagaacccgttggcagccgccctctggccatggtgcccggg 



aggcctacg . 

in 

aggc. 



«<« 98 



.ctcacctcaccagggcaagcccattttgcacagctcc 

««<ll II Mill I Mil I M II I Mill 1 1 II II I 

ctcaccagggcaagcccattttgcacagctcc 



4589 agtaagcacagcgtccaccagctgcagtccagcccctgcggccagccgcg 

IIIIIIIIMIIIIIIIIIIIIIIIIIIIIIIIIIIIIMIIIIIIIIII 

1666 agtgagcacagcgtccaccagctgcagtccagcccctgcggccagccgcg 



WO 02/050285 PCT/EP01/14928 



AL022328 
HDAC9 
AL022328 
HDAC9 
AL022328 
HDAC9 
AL022328 
HDAC9 
AL022328 
HDAC9 
AL022328 
HDAC9 
AL022328 
HDAC9 



4639 cgcagtgaaaggtactctgtg cgcaccgggtggaagtagatggcg 

II | I III I II I I II I l««< 266 ««<||||||||||||||||||| 
1716 cgcagtgaaaggtact cgggtggaagtagatggcg 



4940 tcaaactgtccggacagcgcctgcagctcctccttgcctaggacctgggt 

IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIMMIMI 

1751 tcgaactgtccggacagcgcctgcagctcctccttgcctaggacctgggt 



4990 ctccctgaccagggatacatactctgggctgca . 

Illlllllllllllllllllllllllll««< 
1801 ctccctgaccagggatacatactctggg 



247 



. ctgacctgtgca 

«<«iiiiin 

ctgtgca 



5272 ccaggcccagctcctcttccgaggcctcgcgggctgacaaccgcagacac 

IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIMIIIIIIIIIIMIII 

1836 ccaggcccagctcctcttccgaggcctcgcgggctgacaaccgcagacac 



5322 



1886 



5372 



1936 



ctctattccaggccgcgctgccgcaggcgatccagggctgcggtcaggcg 

iiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiiniiiiiiiiii 

ctctgttccaggccgcgctgccgcaggcgatccagggctgcggtcaggcg 



ctcaggacgctcgatctcgcactcggggctggg 

II MM I II II III Mill I Mill II l««< 

ctcaggacgctcgatctcgcactcgggg 



. . . .cttactcgtccca 
68 ««<|||||||| 
, tcgtccca 



5476 gaacagccgggtggccgtcatgtcctcatggtacacaagcgcgg 

IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIMIIMIIIIIII 

1972 gagcagccgggtggccgtcatgtcctcatggtacacaagcgcgg 
Fig. 5 

HDACl MAQTQGTRRKVCYYYDGDVGNYYYG^ - Q<§PMKPHRIRMTHNLIiIiNY6IiYRKMEIYRPHK 

HDAC9 MGTALVYHEDMTATRIiLWIggPBCB IBRPBRLTAALDRIjRQRGIiEQRCLRIiS ARB 

5 *: * . • s. s . :*.*: : :*:**:: .:: 

HDACl ANAEEMTKYHSDD YIKFLR S IRPDNMSE YSXQMQRFNVGED CP VFDGL FEFCQL STGG - - 

HDAC9 ASEEELGLVHSPEYVSLVRETQVIiGKEELQAIiSGQFDA 1 YPHP STFHCARIiAAGAGL 

**. ** :*:.::*. : . .* . :*x. .. . *. .:*::*. 



HDACl SVAS AVKLNKQQTD IAVNWAGGI*! IH VKKSEASGFCYVNDIVIiAIIiEIiLKYH - -QRVLYlbl 
HDAC9 QLVDAVLTGAVQNGLALVRPPG-HHGQRAAANGFCVFNNVAIAAAHAKQKHGLHRILVVD 
.s..** . *..:*: . * **.::: *.*** .*::.:* . * * s* 



HDACl iBbHsCTOpVBBAFYTTDRVMTVSPHKYGEY (Fp GTGpIiRD I GAGKGK YYAVNYP LRD 



HDAC9 WDVHHGQGIQYLFEDDPSVLYFSWHRYEHGRFWPFLRESDADAVGRGQGLGFTVNLPWNQ 
S.WtSTt:. * *: .*:*:* - ^ s* *:* ::** * 

HDACl G-IDDBSYBAIFKPVMSKVMEMFQPSAVVLQCGSDSLSGDRLGCFNIiTIKGHAKCVEFVK 
HDAC9 VGMGNADYVAAFLHLLLPLAFEFDPELVLVSAGFDSAIGDPEGQMQATPECFAHLTQLLQ 
... . * * * : : : *:*. *::..* ** ** * :: * : .*: . : : : : 

HDACl S FN- IiPMLMLGGGgCtIe RNVARCRTYETAVALDTE IP 

HDAC9 VLAGGRVCAVLEGGYHLESLAESVCMTVQTLLGDPAPPLSGPMAPCQRCEGSALESIQSA 
. . . ***s..:*.. - . *- * 

HDACl NELPYN 

HDAC9 RAAQAPHWKSLQQQDVTAVPMSPSSHSPEGRPPPLLPGGPVCKAAASAPSSLLDQPCLCP 

HDACl DYFEYFGPDFKTiHT SP SNMTNQN TNEYLEKIKQRLFENLRMLPHA 

HDAC9 APSVRTAVALTTPDITLVLPPDVIQQEASALREETEAWARPHESLAREEALTALGKLLYL 
**..* :.*. : : j t. : . s * * : 

HDACl PGVQMQAIPEDAIPEESGDEDEDDPDKRIS - - ICSSDKlJlACEEE - -FSDSEEEGEGGRK 
HDAC9 LDGMLDGQVNSGIAATPASAAAATLDVAVRRGLSHGAQRLLCVALGQLDRPPDLAHDGRS 

• : . s..*. ... * : . :*: * :. . : ...**. 



HDACl WSSNFK f- 

HDAC9 LWLNIRGKEAAALSMFHVSTPLPVMTGGFLSCILGLVLPLAYGFQPDLVLVALGPGHGLQ 

HDACl KAKRVKTEDEKEKDPEEK KEVTEEEKTKEEKPEAKGVKEEVKL 

HDAC9 GPHAALLAAMLRGLAGGRVLALLEENSTPQLAGILARVLNGEAPPSLGPSSVASPEDVQA 

. ** : *::. *: .*:*.- *.: . *:*: 



HDACl A 

HDAC9 LMYLRGQLEPQWKMLQCHPHLVA 



a Catalytic amino acids 

JT 1 Potential RB-binding pocket residues 
Sequence format is Pearson 
Sequence 1: HDAC1 482 aa 
Sequence 2: HDAC2 488 aa 
Sequence 3: HDAC3 428 aa 
Sequence 4: HDAC8 377 aa 
Sequence 5: HDAC4 1084 aa 
Sequence 6: HDAC5 1122 aa 
Sequence 7: HDAC6 1122 aa 
Sequence 8: HDAC7 855 aa 
Sequence 9: HDAC9 673 aa 

Start of Pairwise alignments 
Aligning... 
Sequences (1:2) Aligned. Score: 82 
Sequences (1:3) Aligned. Score: 57 
Sequences (1:4) Aligned. Score: 38 
Sequences (1:5) Aligned. Score: 18 
Sequences (1:6) Aligned. Score: 14 
Sequences (1:7) Aligned. Score: 14 
Sequences (1:8) Aligned. Score: 15 
Sequences (1:9) Aligned. Score: 14 
Sequences (2:3) Aligned. Score: 55 
Sequences (2:4) Aligned. Score: 39 
Sequences (2:5) Aligned. Score: 13 
Sequences (2:6) Aligned. Score: 15 
Sequences (2:7) Aligned. Score: 15 
Sequences (2:8) Aligned. Score: 14 
Sequences (2:9) Aligned. Score: 15 
Sequences (3:4) Aligned. Score: 37 
Sequences (3:5) Aligned. Score: 12 
Sequences (3:6) Aligned. Score: 13 
Sequences (3:7) Aligned. Score: 13 
Sequences (3:8) Aligned. Score: 15 
Sequences (3:9) Aligned. Score: 15 
Sequences (4:5) Aligned. Score: 21 
Sequences (4:6) Aligned. Score: 16 
Sequences (4:7) Aligned. Score: 16 
Sequences (4:8) Aligned. Score: 20 
Sequences (4:9) Aligned. Score: 22 
Sequences (5:6) Aligned. Score: 59 
Sequences (5:7) Aligned. Score: 59 
Sequences (5:8) Aligned. Score: 49 
Sequences (5:9) Aligned. Score: 21 
Sequences (6:7) Aligned. Score: 100 
Sequences (6:8) Aligned. Score: 43 
Sequences (6:9) Aligned. Score: 19 
Sequences (7:8) Aligned. Score: 43 
Sequences (7:9) Aligned. Score: 19 
Sequences (8:9) Aligned. Score: 20 

Guide tree file created: [/bioinfnv/software/biobenchsw/tmp/align/1478.dnd] 
Start of Multiple Alignment 
There are 8 groups 
Aligning... 
Group 1: Sequences: 2 Score: 24259 
Group 2: Sequences: 3 Score: 18415 
Group 3: Sequences: 4 Score: 12882 
Group 4: Delayed 
Group 5: Sequences: 2 Score: 9847 
Group 6: Sequences: 3 Score: 7569 
Group 7: Sequences: 4 Score: 5689 
Group 8: Sequences: 8 Score: 2841 
Sequence: 9 Score: 3452 
Alignment Score 36872 
CLUSTAL-Alignment file created [/bioinfnv/software/biobenchsw/tmp/align/1478.out] 
CLUSTAL W (1.81) multiple sequence alignment 



HDAC5 
HDAC6 
HDAC4 
HDAC7 
HDAC1 
HDAC2 
HDAC3 
HDAC8 
HDAC9 



MNSPOTSDGMSGKEPSI-EILPRTSLHSIPVTVZVKPVLPJIAMPSSMGGGGGGSPSPVEIJI 
MNSPNESDGWSGREPSLEI LPRTSLHSI PVTVEVXPVLPRAKPSSMGGGGGGSPSPVEIJl 
MSSQSHPDGLSGRDQPVEIOJJPAJlVNHMPsi 

^ • MD1»RVGQRPPVEPPP 



HDAC5 GA1/V G S VDPTLRE OOi,OQEl»I*A I#K QOOQ1«OR QLLF A EFQX pHDHL TR QHEVOLQKHliX QQ 

HDAC6 GAXVGSVT)PTLREOOLTOEl,lJa*O0Q01>OR0^ 

HDAC4 • VAXPALREOOI^OELLAlAOXOQl QRQ I 1*1 AEFORQHE01»SRQBEAQt>HEHI RQQ 

HDAC7 - EPTLIAI^RPC/RLHHHLFLAGJ^Q ..QQ 

HDAC1 ■ — ... 

HDAC2 ■ 

HDAC3 w 

HDAC8 ' 

HDAC9 



HDAC5 
HDAC6 
HDAC4 
HDAC7 
HDAC1 
HDAC2 
HDAC3 
HDAC8 
HDAC9 



gEMIJ^OOOEKl^AKROOELEOOROREOOKOEELEKORLEOOUILHKKERSKESAlAS 
OEHLAAJ^QO0EMLAARRO0El»EOOR0REOQROEELEK0RXEQOl»l»l LRNKXXSXESAIAB 

Q EWLAMKJ3 QQE 1XEH QR - -RLERHRQ- - E 0E LEX QHRE CKLQQ LXJ3X EX GXE S A VAS 

RSVEPMRLSMDTP MPELQVGPQBOEIJIOLLHXDRSKRSAVAS 



HDAC5 TEVKLRL0ErLl*SRSREPTPGGL»KHSLPOHPKCWG---AHHASl.I>OSSPP0SGPPGTPPSy 

HDAC6 TEVKLRL0ErLLSRSREPTPGGLNHSLP0HPKCWG--AHHASl^>OSSPP0SGPPGTPP8T 

HDAC4 TEVKMKLOEFVlJna^RAl^Rm^CISSDPRyKyGRTOHSSLBOSSPPOSG VSTST 

HDAC7 S WX0RLAEVI LRRQQAAXERTVHPKSPGI P YRTLEP - LETEG ATRSMLS ST 

HDAC1 ... 

HDAC2 
HDAC3 
HDAC8 
HDAC9 



HDAC5 
HDAC6 
HDAC4 
HDAC7 
HDAC1 
HDAC2 
HDAC3 
HDAC8 
HDAC9 



•KLPLPG-PyDSRDDFPLRKTASEPjnjCVRSBLRgKVAERKSSPLLRRJUJGTVISTFKKRA 
RLPLPG-PYDSRDDrPLKKTASEPmJ^VKSKLROKVAERRSSPLLRRKDGTVISTrKKRA 
NHPVX^-MyDAKDDFPLRXTASEPNLKl^SKLKOKVAERRSSPLLRRKDGPVVTAL 
l»PPVPSLPSDPPEHFPl.RKTVSEPKLKLRyW»K-KSl*ERRKNPLLRKE--SAPFSL^RRP 



HDAC5 
HDAC6 
HDAC4 
HDAC7 
HDAC1 
HDAC2 
HDAC3 
HDAC8 
HDAC9 



VElTGAGPGASSV™SAPGSGPSSPK-S5HST3AENGrTGSVPNIPTEMl*POHRAI*PLDfi 
VE3 TGAGPGASSVCNSAPGSGPSSPK - S SHST1 AENGFTGSVPHI PTEMLP QKRA1*PI*DS 

LDVT D S ACSS AP GSGPS 5 PNHS 5 GS VS AENG I APA VPS 2 PAETS1aAHHI#VAREO 

AET1X3 DSSPSSSSTPASGCSSFNDSEHG-r *— 



HDAC5 
HDAC6 
HDAC4 
HDAC7 
HDAC1 
HDAC2 
HDAC3 
HDAC8 
HDAC9 



S PNQFSLYTS PS1*PN1 S LGl^ATVTVTNSHLTASPKLS TCQEAEROALQS I.KQGGTLTGK 
S PHQFS LYTSP SLI^Hl SLG1>QATVTVTNSHLTAS PK1*S TOOEAERO AbQSLROGGTliTGK 

S AAFLFLYTSP SLPH1 TLGLPATG PSA GTAGQODTERLTI»PAX.OQR 

PHFJ LiG DSDRRTHPTLGPRG — 



HDAC5 
HDAC6 
HDAC4 
HDAC7 
HDAC1 
HDAC2 
HDAC3 
HDAC8 
HDAC9 



FMSTSSIPGCLLGVALEGDGSPHGHASLl*OHVl>LLEOAROOSTLIA -VPLHGQSP 

FMSTSSJPGCLbGVALEGDGSPHGHASLMJHVLLLEQARTOSTLIA -VPLHGQSP 

LFPGTHLTPyi^TSPLERDGG-AAHSPLl^OHMVl-LEOPPAOAPLVTGli- - GA1>PLHAQS- 
P 3 IjGSPHTPLFLPHGLEPEAG- GTIiPSRl«QPZ Ll*I-DPSGSHAPU,TVPGI*<3Pl*PFHFAOS 



HDAC5 l,VTGERVATSimTVGKLPRHRPIiSRTOSSPLPOSPOALOOLVMOOOH003F*lj.EKOKQ--** 

HDAC6 l-VTGERVATSMRTVGRLPRHRPLSRTOS SPLP0SPQALOQLVMOOQHQQF LEKQKQ 

HDAC4 LVGADRVSPSIH- - ^KlJlOHRPLGRtOSAPLPONAOA^OHl/VaOOOHOCFl^EKHKOOFOQ 

HDACT7 LMTTERLS GSGUTWPLSRTRSEPLPPSATAPPPPGPMQPRLEOIATHVQ 

HDAC1 

HDAC2 ~ 

HDAC3 

HDAC8 " — — 

HDAC9 ' 
HDAC5 
HDAC6 
HDAC4 
HDAC7 
HDAC1 
HDAC2 
HDAC3 
HDAC8 
HDAC9 



OCLOLGKlLTKTGEl^ROPTTHPEETrEELTEOOEVLLGEGALTMPREGSTESESTOHDL 
QQl.Ol^K3LTXTGEl^ROPTTHPEETEEELTE00EVLlrGEGAl.TMPREGSTESESTOEDl# 
001,QMNX3 3 PXF SE PAR OPE SHPEETEEE LK EH 0 - AL.LDEP YLDRLPGQX EAHA Q AGVQV 
- - r -VI RRSAXPSEXPRUlQ:! PSAEDI^TDGGGFGQVVDDGLEHRELGHGOPEARGPAPI, 



HDAC5 
HDAC6 
HDAC4 
HDAC7 
HDAC1 
HDAC2 
HDAC3 
HDAC8 
HDAC9 



EEEDEEEDGEEEEDC30VKDEEGESGAEEGPPLEEPGAGyKXLFSDAQPl,OPI*OVyOAPIi 
EEEDEEEDGEEEEDC3QVXDEEGESGAEEGPDI>EEPGAGYXXX,FSDAQPl,QFI^VYOAPI# 

X0EP3ESDEEEAE PPREVEPGORQ-FSEQELI^ROOALIXEOORIHQLRKYOASM 

OOHPQVLLWEO0R l^AGR^PRGSTGDTVLLPtAOGGHRPLSRAQ SSPX 



HDAC5 
HDAC6 
HDAC4 
HDAC7 
HDAC1 
HDAC2 
HDAC3 
HDAC8 
HDAC9 



. SLATVP - HOAl^RTOSSPAAPGGWK.SPPDQPVWn*-PTTGVyyi>Trm^OCMCG^ 

SUlTVP HOAliGRTQSSPAAPGGMKSPPDQPVKKL^ FTTGWYDTFMI*KHQCMCGH 

EAAG3PVSFGGHRP1>SRAOSSPASATFPVSV0EFPTXPR-FTTGLVYDT1*MXJOJQCTCGB 
APASLS APEPASQARVl-SSSETPARTLPFTTGLIYDSVMXJtHOCSCGD 

maotog - TRRKVcyyyr>GDVGNrryoQ 

MAVSQGGGKRKVCyyYDGM GHYYYGO 

MAXTVAYFYDPDVGNFHYGA 

„ * MEEPEEPADSGQSLVPVYI YSPEYVgMCD 

M GTALVYHEDMT A TRLL.WDD 



KDAC5 TUV^PEHAGR10S3VfSRM>ETGLLSRCERlRGRJU^TX4>E10T>mSEYUTI>r.VGTSPUmO 

HDAC6 THVHPEHAGRIOS3WSRLQETGLLSKCER3RGRJU;TI^EIOT\aiSEyHTI*I*VGTSPlJniO 

KDAC4 SSSHPEHAGR10S3WSRL0ETGLRGRCEC3RGR3UVTLEE1^TVHSEAHTI>1>YGTKPI J KRQ 

KDAC7 NSRHPEHAGR3 QSJ WSRLQERGLRSOCECLHGRXASLEEl^SVHSERHVLLY GTNPLSRI* 

HDAC1 G- -HPMXPHR3 RMTHNLLLNYGLYRXME J YRPHXANAEEMTKYHSDDY3 RFLRS I RPDNW 

HDAC2 G- •HPMKPHR3RMTHm>LLNYGlryRK>IE3 YRPHXATAE EMTXYH SDEY3 XF LRS 3 RPDNM 

HDAC3 G • » BPMXPHRLALTH.S1»V1»HY G1/YKXM3 VFXP Y QAS QHDMCRFHSEDY3 DF3jQRV S PTNM 

HDAC8 S- -UVR3PXXASMVHSX*3EAYAUiXQMR3 VXPRVASMEEMATFHTDAYI^HI^OKVSOEGD 

HDAC9 PECEIERPERLTAALDRLRQRGLEQRCLRLSAREASEEELGLVHSPEYVSLVRETQVLGK 

. * . t • • I *. .n •as 

HDACS KLDSKKLLGP3 SOKMYAVLPCGG3GVDSD3VWNEHHS SS AVRMAVGCLLEXiAFJCVAAGBL 

HDAC6 XLDSXXLLGP 3 SOKMYAV3iPCGG3 GVDSDTVWNEMHSSSAVRMAVGCLLErAFKVAAGBL 

HDAC4 XLDSXXlO^Sl^-VFVKl^CGGVGVDSDT3WNEVHSAG^^ 

HDAC7 KU>NGRl*AGLLAORMFEMLPCGGVGVDTE)T3 WHELKS SNAARKAAGSVTOUvFKVASRBL 

HDAC1 SB YSKOMORFKVGEI)CPVFDGl,FEFCOLSTGGSVASAVjaJTROOTD3AVNH 

HDAC2 SE YSXQMH3 FNVGEDCP AFDGliFEFCQLSTGGSVAGAYRUTO OQTDHAVHW 

KDAC3 QG -FTKSLNAFNVGDDCPVFPGLFEFCSRYTGASLQGATOLNNKICDIAINW 

HDAC8 DD HPDSIE- YGLGYDCP ATEG3 FDYAAA3 GGATI TAAQCLI DGMCKVAINW 

•HDACS EE - LOALSGQFDAIYFHPSTFHCARLAAGAGLOLVDAVIiTGAV 

t s • c 

HDACS KNGFA3 3RPPGHHAEESTAMGFCFFNSVA3TAKLLOOK LNVGKVL3VX)VTO3HHGHGT 

HDAC6 XNGFA3 3 RPPGHHAEESTAMGFCFFNSVA3 TAKLLQQK LNVGXVL3VDWDIHHGNOT 

HDAC4 KNGFAWRPPGKHAEESTPMGFCYFNSVAVAAKLLOOR LSVSK3L3VX)WpVHHGNGT 
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HDAC7 W*GFAVVRPPGHHADHSTAMGFCFFNSVA3 ACRQLQOOSKASlUVSFj LXVDWDVHHGNGT 

HDAC1 AGG~ UmAKXSEASGFCYVNDlVlA3LELliKY- H0RVL.YIDID3 HHGDGV 

HDAC2 AGG~ £HHAJU^YEASGFCYVND3 VIA3 LELL.KY HQRV3/YID3D3 HHGDGV 

HDAC3. AGG- LHHAXXFEA£GFCrWD3V3G3LELLKY BPRVLYIDIDI HHGDGV 

HDAC8 SGG- WHHA3UU3EASGFCYLNDAVXG3 LRLRRK FER3 1»YVD1*D1»HH GDGV 

HDAC9 QNGLALVRPPGHHGQRAAANGFCVFNNVAIAAAHAKQKHG---LHRILVVDWDVHHGQGI 

QQAFYNDPSVLY 1 SUBtYDNGNFFPCS - - GAPEEVGGGPGVGYWVNVAWTGGVDPF3 GDV 

HDAC6 OQAFYKDPSVl»y 1 S LHRYDWGHFFPGS - - GAPEEVGGGPGVGYNVNVAWTGGVDPF3 GDV 

HDAC4 OOAFYSDPSVl*YWSLHRYDDGNFFFGS- • GAFDEVGTGFGVGFNVNMAFTGGLDPPMGDA' 

HDAC7 OOTnf QDPSVLY 1 SLHRHDDGUFFPGS - - GATOEVGAGSGEGFNVNVAWAGGLDPPMGDP 

™AC3 EEAFYTTDRVMTVSFHRYG- - EYFPGT- - GDLRD3 GA GXGK YYAVNYP LRDGID DK 

™ AC2 EEAFYTTDRVWTVSFHXYG- - EYFPGT- - GDLRD3 GAGXGKYYAVNFPMCDG3D DE 

mAC3 OEAFYLTDR VWTVS FHX Y GN- YF FPGT - - GDMYEVGAESGRYYCLNVPLRDGID DQ 

HDAC8 EDAFSFTSKVMTVSLHRFSP - GFFPGT- - GDVSDVGLrGXGRYYSVNVF3 QDGIQ DE 

HDAC9 QY1.FEDDPSVLY FSWHRYEHGRFWPFLRESDADAVGRGQGl^FTVNLPWN QVGMGNA 



HDAC5 
HDAC6 
HDAC4 
HDAC7 



E YLTAFRTWMP 1 AHEF S PDW1/V SA GFDAVEGHL SPDGGY SVTAR CFGHLTR Q1MTXJU3 
EYLTAFRTVVOTIAHEFSPDVVLVSAGFDAVEGHLSPX^GySVTARCFGHl^TROlJTri^ 
EYI^FRTVVn^3ASEFAFDVVLVSSGFDAVEGHPTPlX*7ra>SAJ*CT 
EYLAAFRI VVMPIAREFSPDLVLVSAGFDAAEGHPAP1X3GYHVSAXCFG 
HDAC1 SyEAIFKPVMSKVKEMFOPSAWLOCGSDSl>£GD- -RbGCFHLTlKGHAXOVEFVKSFKI* 

HDAC2 SyGOirKPIISRVMEMYOPSAWtQCGADSl-SGD- -Kl^CFNLTVT^GHAKCVEVVKTFHI* 

£yKHLFQPVJNQWDFyQPTCIVl>OCGADSI,GCD- -IUA3CFNLS3RGHGECVEYVKSFWX 
1107108 KYYQ3 CESVLREVyOAFNPRAWXOLGADTIAGD- -PMCSFNMTPVGlGXCUOr3r*>W0I» 

HDAC9 DYVAAFLHLLLPLAFEFDPELVLVSAGFDSAIGD--PEGQMQATPECFAHLTQLLQVLAG 
II l « * in. • *i i t „ i 

GRWLALEGGHDLTA2 CDA S EACVS ALLS VE LQ PLDEAVLQQKPNINAVATLEKVX 

HDAC6 GRW1ALEGGHDLTA3 CDA SEACVS ALLS VELO PLDEAVLQQKPNI NAVATtEKVJ 

HDAC4 GR3V1ALEGGHDLTA3 CDA S EA CVS AI*LGNELD PLPEKVLQORPNAKAVRSKEKVM 

HDAC7 GAWLALEGGHDLTA3 CDA S EA CVAALLGWVD PLSEEGWKOXPQP- ..... 

PMLMLG-GGGYTJ RNVARCRTYETAVALDTEJ PNEL- PYKDYFEYFGPDFKLHI SFSN-M 
^2 PLLMLG -GGGYT3 RNVAR CWTY ETA VA1-DCE3 PNEL - P VOTYFE YFGPDFXM J S PSK-M 

K °AC3 PLLVLG-GGGYTVRWARCWyETSLLVEEAaSEEL-pySKYFEYFAPDFTLHPDVSTM 
HDAC8 AT1*ILG» GGGYNLANTARCWTYLTGVI LGKTLJ5EBI - PDHEFPTAYGFDY*VT*EITFSC-R 

HDAC9 GRVCAVLEGGYHLESLAESVCMTVQTLLGDPAPPLSGPMAPCQRCEGSALESIQSARAAQ 

* **IS mm m I • 

™AC5 M OSJ^SCVQXFAAGl^RSI^EAOAGETEI^arrVSAMAXI^ 

HDAC6 El 0 SKHHS CV QXF AAGLGR S I*R EA 0A GETEEAE TV S AMALLS V G AE QA QAAAAREH S FRF 

HDAC4 E3 HSRYWRCXCiRTTSTAGRSU EAQT CENEEAE TVTAMAS LS VGVKPAEK RP 

HDAC7 QCBPLS GGRDPGAQ - 

HDAC1 TNQNTNEYLEX3 XQRLFENLKKLPHAPGVC>MOA3 PEDAIPEESGDEDEDDPDKRI S 3 CSfl 

KDAC2 • TNQKTPEYKEX3 XQRLFENLRMLPHAPGVgMgA 3 PEDAVHEDSGDEDGEDPDKR3 S3 HAS 

HDAC3 EN0NSRQYLDQ3 LQT2 FENLKMLNHAPSV03 HDVPADLLTYDRTDE - ... 

HDAC8 PDRKEPHR3 OQIXtHYJ XGNLXHW- - • - 

HDAC9 APHWKSLQQQDVTAVPMSPSSHSPEGRPPPLLPGGPVCKAAASAPSSLLDQPCLCPAPSV 



HDAC5 AEEPMEQEPAX . 

HDAC6 AEEPMEQEPAX .......... 

HDAC4 DEEPMEEEPPl* • r - 

HDAC7 - 1 

HDACl DM3ACEEEFSDSEEEGE(3GRja?SSNFIOC-AXRVKTEDEREICDPEERXEVTEEEKTTCB-* 

WAC2 DRR1 ACDEEF SDSEDEGE GGRRNVADHKXGAXXAR 2 EEDKXETEDKKTDVKEEDRSKDNfi 

^ACS -'-ADAEERGP- - EENYSKPEAPKEFYDGDHDND- - 
HDAC8 
HDAC9 



RTAVALTTPDITLVLPPDVIQQEASALREETEAWARPHESLAREEALTALGKLLYLLDGM 



HDAC5 
HDAC6 
HDAC4 
HDAC7 
HDAC1 
HDAC2 
HDAC3 
HDAC8 
HDAC9 



- EKPEARGVKXEVKLA- - 
GEKTDTKGTKSZQ1-SNP- 
XESDVEI - » 



LDGQVNSGIAATPASAAAATLDVAVRRGLSHGAQRLLCVALGQLDRPPDLAHDGRSLWLN 



HDAC5 
HDAC6 
HDAC4 
HDAC7 
HDAC1 
HDAC2 
HDAC3 
HDAC8 
HDAC9 



IRGKEAAALSMFHVSTPLPVMTGGFLSCILGLVLPLAYGFQPDLVLVALGPGHGLQGPHA 



HDAC5 
HDAC6 
HDAC4 
HDAC7 
HDAC1 
HDAC2 
HDAC3 
HDAC8 
HDAC9 



ALLAAMLRGLAGGRVLALLEENSTPQLAGILARVLNGEAPPSLGPSSVASPEDVQALMYL 



HDAC5 
HDAC6 
HDAC4 
HDAC7 
HDAC1 
HDAC2 
HDAC3 
HDAC8 
HDAC9 



RGQLEPQWKMLQCHPHLVA 
[Figure panel; lane labels: Brain, Colon, Heart, Kidney, Liver, Lung, Placenta, Sm. Intestine, Spleen, Stomach, Testes] 
% of 18S rRNA level in A549 cells 

% of HDAC9 expression in A549 cells 

brain 1 
brain 2 
cerebellum 
spinal cord 
fetal brain 
trachea 
Liver 1 
Liver 2 
fetal liver 
stomach 
pancreas 
colon 
intestine 
kidney 
bone marrow 
spleen 
thymus 
thyroid 
adrenal gland 
salivary gland 
mammary gland 
skeletal muscle 
testis 

Prostate 1 
Prostate 2 

placenta 

H1299 
T24 
SJRH30 
SJSA-1 
Fibroblast 

A549 
Fig. 9 



Sequence format is Pearson 
Sequence 1: HDAC4catalyticdomain 
Sequence 2: HDAC5catalyticdomain 
Sequence 3: HDAC6catalyticdomain1 
Sequence 4: HDAC6catalyticdomain2 
Sequence 5: HDAC7catalyticdomain 
Sequence 6: HDAC9completepeptide 
Start of Pairwise alignments 
Aligning... 

Sequences (1*2} .m«mtf; Scorer 
Sequences (J^S"} .AU"iqnedi -Scores 
Sequences 11**4* Al-ignecr, .Score.; 
isquewces |i::S*) XlT-tfhetf; ;Scor-ei 
AM^ned- Scores 
Aliened:, Scorer 
A % l*lqh'exfe : Scor«i 
.A^igneiU Scores 
AM^ned% Score.: 



336 aa 
329 aa 
302 aa 
481 aa 
334 aa 
673 aa 



Sequences 
-Sequence's 
Sequences 
Sequences- 
gue-nce's 



(2-T3) 
*2*4* 



7-8 
.44 
..45 
7S 
37 
•42. 
44 
72- 
3:7 
:49 
-4* 
55 
..4*' 

4a* 

3B 



17 b i o i'ri* nv/oolJt vare/b i obenchs w/xmp/all^nV 

366 ii.dncJfJ 



Sequences. C3?4i) .A^^netfi Score 
Sequences fS^sr^iagfas*. : Scor«4 
Sequences AWW** .S6o»fe.: 

Sequences i*:-SJ .A*l : iq*ne*Ii .Score: 
Sequences- t-4.t 6") Aligned* -Scorei 
Sequences- (5.:6$ Aligned .Soor-e.; 
Guide, tree : i*"i'e. -creetedi 

St-arvt Muat-i^lle. Al^wnent 
Tber* ctce. 5 groujfcs 
Al'igning. . • 
Group ll.sequewce»: 
: Group * 2 5 " Sequences i 
• Groufe Sequenced: 
Groui'.4:5 ..Sequenced 
Group 5": -Sequences.;' 
Alignment? .S6ote. rSOOO 

•GlillSTA'L-.Aldenwentr. '.*£ie ^created, l-jtei o*n*nv/soi»are/biobertcte 

GLDSrPAl, V: 1X4*1% tau&VS&l*. sequence 'alignment? SfSfr^.OktJ 



.2*. 
:.3- 
.4* 
-..■2% 
6 



Scores 6517 
Score :'<3?£> 
.Scor-er^BOl* 
.Scorers Z05 
Score t'4^95 
HDAC4catalyticdomain 
HDAC5catalyticdomain 
HDAC7catalyticdomain 
HDAC6catalyticdomain2 
HDAC6catalyticdomain1 
HDAC9completepeptide 

HDAC4catalyticdomain 
HDAC5catalyticdomain 
HDAC7catalyticdomain 
HDAC6catalyticdomain2 
HDAC6catalyticdomain1 
HDAC9completepeptide 

HDAC4catalyticdomain 
HDAC5catalyticdomain 
HDAC7catalyticdomain 
HDAC6catalyticdomain2 
HDAC6catalyticdomain1 
HDAC9completepeptide 

HDAC4catalyticdomain 
HDAC5catalyticdomain 
HDAC7catalyticdomain 
HDAC6catalyticdomain2 
HDAC6catalyticdomain1 
HDAC9completepeptide 

HDAC4catalyticdomain 
HDAC5catalyticdomain 
HDAC7catalyticdomain 
HDAC6catalyticdomain2 
HDAC6catalyticdomain1 
HDAC9completepeptide 

HDAC4catalyticdomain 
HDAC5catalyticdomain 
HDAC7catalyticdomain 
HDAC6catalyticdomain2 
HDAC6catalyticdomain1 
HDAC9completepeptide 

HDAC4catalyticdomain 
HDAC5catalyticdomain 
HDAC7catalyticdomain 
HDAC6catalyticdomain2 
HDAC6catalyticdomain1 
HDAC9completepeptide 

HDAC4catalyticdomain 
HDAC5catalyticdomain 
HDAC7catalyticdomain 
HDAC6catalyticdomain2 
HDAC6catalyticdomain1 
HDAC9completepeptide 



U*riKPRrTTGl:yyDTtMU<MpCTCGSSSSfl3 

CyvyDirKLKHOO*OUllT 

TCL3 ypSVMLKHOCSCGDll 

cirwD ohmmnh arum's- - 

< VEDE gLTITErHClVDD S — 

MCTAiVYllEWllATRLXVDDPECt 



EHA GP3 p SPLJ0E T GL 
EHAGP3 0 STWSPXgETGi; 
CHA GK3.Q STH SWLflERGt: 

1*ECPEP3JHA1 KE QE3 X?E GL 
^ERPEJOlTAAXDRUROKGL 



» CKCEC3 P ITRKA T EEEL g TVHSEAH7 -lJL^CT JTPL1TPDKIJ> SKKULUStA 
l_^KCX'R3FGI^rijDE101VHSEraT-lXyGTSPl3mOKI^SKKl^ 
PS P CE CLP GHXA SI: EEL Q SVH S ERHV -ULYGTKPLSRliKLDK GKEA GEIA 
A CP CETE T I* RT A TEAELU'T CH SAX YV GKEPA TEKKKTPXEHRE 

LDRcvsr-QAjirArxcEuiLVitsixy — 

E OP 03IL;SAPXA SEEELGLVHSPEYy SliVPX 1 QVL^KEXX^AL 



1* 



S-VTVJaPCGGVGyT>SD73W £ /HSAGAAJRLAV GCVVElrVrKVA TGEEKH 

UXMYAVEPOCC3 CVDSPTW* 2 IHSSSJrVTOWVS*^^ 

D PJCT IMEP.eC GVGVDTD1TW11 C ^SSl'lAAJWAAGSyTDiaOfTWAST^ElM 

SSIIEPSJyi C >S TFA'CA g EA T GAA CPiVEAyi;5 GEVES 

ADTXDSVyi il > 11 SYS TAX^S CSVl^VDA^GA^ Ml 

gr^DA j yr i » s ttmcaplaa GAitityi) aye t gav o h 



GTAVVPJ?Pl^TniXEST] 
CrA3JJU>PC Oi JEES1J 
GTAyVPPPG O.i J5HSTI 
CAAVVPPP.C fMJ XpDAA 
OiA 3 3 PJPP C Oi * 
Gl^lEVPPPC CH ; 



acyni svavaaklx^ or — : fc5V SK.iiL?y»w 

bCI OrrWSVAi7AXLi^J)K r ^xiHVG^ V 
bGI CJXJlSVAlATOCjtiDOSKASKAS^ H £ 

hci crrHsvAyj^joiApTijs— GiiAi^aii 5 ir 

kG^OiTllHVAyAARYApgX — krjrrvLt n m 
^orrmiyAifiAj^tAxpx — ^,-Hta?ouii^ ji »w 



0y>O\ CKGTO OATY 5DP^vrXM SEKRXDD C J 

i p [Cj^y>?DP£yiJy.3 si^yiwci 
nolanGxiipjry QDPsvEyi sxmrhdi5gii 

OO GJ1CT0M)lFCTDP.SyiJyV^lJHJ^MGT 

m> g 0G7pr.Tr-D pDPsyx3nr s 1 hryeo 
70) g g G3 QXijr.r^DP.svi^smmx3CH€) 




GS— - GAPDEV G 7TGP GVG 
G^-GAlEEVCGEPOyGr 
CS»- GAVDEV GAGSGEG 
IGDEGASSg? GRAAGT© 
HLjKA- SKW S 7-7T GTG QG OJG' 
fl^jbspADAV GP I G QCLXT 



r-HVJWAr-T C GEDPPM GDAEYljAAJ^KTyVMPXASCTAJP^VVEySS G] 

-ywvwvAH^ CGvi>pp 3 CDVi^^AJ^j^TWiffiAkcrspDvyjLV 

r^HVlTV AV A GGEDP PM G DPEXlJiA^ JtTVVMP 1ARI2^P^EVEVSA tT 
r gVKVAHH G - -PPM GDAi)TiiAA*KKPiviiP 1A>CTotSvEVSa]g 

■y 7 ihvpvh.o ^ - -v gmkdadt iAJ^i^wxii|VAlCTin 




CKTTPLC GY1TL SAiR CFGytTK GtiA [CGRlVliAJjE G 
CKLSP L G GY-SV TAP* CPGHL TTtQLH TEA GGPVVl^Al^G 
GUPAPE'C GTHV SAX CrGXM T g Q LMJ JEft* G GAWEAli G 

XD-^PEGGc;gysPCGyAJtL^ , ki^4GEASbTaai:m 

GP-^PKCI>lA^TPAGrACLTHia^Cl^GGli3xi^ 
C7>--PEGCJWQATPEXTAJiL^gLa r 9y^ 




Iltak-?- 1 ^- 
i*A*<3>^-^- 




CTRSU-GDPPPLLTEPRPPI.5 GAEA52 TET J UV>iRPWRSEPVMKV 

TV.pTlXCDPAJ>PESCrMAP<:O^CEGSAXES3 0SA3^C^ 
HDAC4catalyticdomain 
HDAC5catalyticdomain 
HDAC7catalyticdomain 
HDAC6catalyticdomain2 
HDAC6catalyticdomain1 
HDAC9completepeptide 



ci rsu. gdpppla. rupnppus GALA S J*£T 1 WKRRWKSIjKVMKV 



TVQTLLGDPAPPLSGPMAPCQRCEGSALESIQSARAAQAPHWKSLQQQDV 



HDAC4catalyticdomain 
HDAC5catalyticdomain 
HDAC7catalyticdomain 
HDAC6catalyticdomain2 
HDAC6catalyticdomain1 
HDAC9completepeptide 



ED RE GV S S SKXV TKKAP QP JO<PJUjAEIW T TPXKKVVEA 6 - 



TAVPMSPSSHSPEGRPPPLLPGGPVCKAAASAPSSLLDQPCLCPAPSVRT 



HDAC4catalyticdomain 
HDAC5catalyticdomain 
HDAC7catalyticdomain 
HDAC6catalyticdomain2 
HDAC6catalyticdomain1 
HDAC9completepeptide 



M CKVTSA STCEX5 TP.GQTHSE7A VVAL D QP SEAAT 6 GAT 

AVALTTPDITLVLPPDVIQQEASALREETEAWARPHESLAREEALTALGK 



HDAC4catalyticdomain 
HDAC5catalyticdomain 
HDAC7catalyticdomain 
HDAC6catalyticdomain2 
HDAC6catalyticdomain1 
HDAC9completepeptide 



2AQX4SEAJ13 G tUlMLC QTT SEEAV G lift TPJDQTT SXX-TV G GAIL- 



LLYLLDGMLDGQVNSGIAATPASAAAATLDVAVRRGLSHGAQRLLCVALG 



HDAC4catalyticdomain 
HDAC5catalyticdomain 
HDAC7catalyticdomain 
HDAC6catalyticdomain2 
HDAC6catalyticdomain1 
HDAC9completepeptide 



QLDRPPDLAHDGRSLWLNIRGKEAAALSMFHVSTPLPVMTGGFLSCILGL 



HDAC4catalyticdomain 
HDAC5catalyticdomain 
HDAC7catalyticdomain 
HDAC6catalyticdomain2 
HDAC6catalyticdomain1 
HDAC9completepeptide 



VLPLAYGFQPDLVLVALGPGHGLQGPHAALLAAMLRGLAGGRVLALLEEN 



HDAC4catalyticdomain 
HDAC5catalyticdomain 
HDAC7catalyticdomain 
HDAC6catalyticdomain2 
HDAC6catalyticdomain1 
HDAC9completepeptide 



STPQLAGILARVLNGEAPPSLGPSSVASPEDVQALMYLRGQLEPQWKMLQ 



HDAC4catalyticdomain 
HDAC5catalyticdomain 
HDAC7catalyticdomain 
HDAC6catalyticdomain2 
HDAC6catalyticdomain1 
HDAC9completepeptide 



CMPHtVA 
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TV A ° 



[CLUSTAL W (1.81) multiple sequence alignment log, largely illegible in the scan. Legible details: input in Pearson format; five sequences (1: HDAC1 catalytic domain; 2: HDAC2 catalytic domain, 310 aa; 3: HDAC3 catalytic domain; 4: HDAC8 catalytic domain, 308 aa; 5: HDAC9 complete peptide, 673 aa). The pairwise and multiple alignment scores cannot be recovered reliably.]






WO 02/050285 



PCT/EP01/14928



[Figure residue: CLUSTAL W alignment of the HDAC1, HDAC2, HDAC3 and HDAC8 catalytic domains with the complete HDAC9 peptide. The aligned residues are not legible in the scan.]

Fig. 11A

HDAC9v1 MGTALVYHEDMTATRLLWDDPECEIERPERLTAALDRLRQRGLEQRCLRLSAREASEEEL
HDAC9v2 MGTALVYHEDMTATRLLWDDPECEIERPERLTAALDRLRQRGLEQRCLRLSAREASEEEL
HDAC9v3 MGTALVYHEDMTATRLLWDDPECEIERPERLTAALDRLRQRGLEQRCLRLSAREASEEEL
        ************************************************************

HDAC9v1 GLVHSPEYVSLVRETQVLGKEELQALSGQFDAIYFHPSTFHCARLAAGAGLQLVDAVLTG
HDAC9v2 GLVHSPEYVSLVRETQVLGKEELQALSGQFDAIYFHPSTFHCARLAAGAGLQLVDAVLTG
HDAC9v3 GLVHSPEYVSLVRETQVLGKEELQALSGQFDAIYFHPSTFHCARLAAGAGLQLVDAVLTG
        ************************************************************

HDAC9v1 AVQNGLALVRPPGHHGQRAAANGFCVFNNVAIAAAHAKQKHGLHRILVVDWDVHHGQGIQ
HDAC9v2 AVQNGLALVRPPGHHGQRAAANGFCVFNNVAIAAAHAKQKHGLHRILVVDWDVHHGQGIQ
HDAC9v3 AVQNGLALVRPPGHHGQRAAANGFCVFNNVAIAAAHAKQKHGLHRILVVDWDVHHGQGIQ
        ************************************************************

HDAC9v1 YLFEDDPSVLYFSWHRYEHGRFWPFLRESDADAVGRGQGLGFTVNLPWNQVGMGNADYVA
HDAC9v2 YLFEDDPSVLYFSWHRYEHGRFWPFLRESDADAVGRGQGLGFTVNLPWNQVGMGHADYVA
HDAC9v3 YLFEDDPSVLYFSWHRYEHGRFWPFLRESDADAVGRGQGLGFTVNLPWN-----------
        *************************************************

HDAC9v1 AFLHLLLPLAFEFDPELVLVSAGFDSAIGDPEGQMQATPECFAHLTQLLQVLAGGRVCAV
HDAC9v2 AFLHLLLPLAFEFDPELVLVSAGFDSAIGDPEGQMQATPECFAHLTQLLQVLAGGRVCAV
HDAC9v3 -----------QFDPELVLVSAGFDSAIGDPEGQMQATPECFAHLTQLLQVLAGGRVCAV
                   :************************************************

HDAC9v1 LEGGYHLESLAESVCMTVQTLLGDPAPPLSGPMAPCQRCEGSALESIQSARAAQAPHWKS
HDAC9v2 LEGGYHLESLAESVCMTVQTLLGDPAPPLSGPMAPCQRCEGSALESIQSARAAQAPHWKS
HDAC9v3 LEGGYHLESLAESVCMTVQTLLGDPAPPLSGPMAPCQRCEGSALESIQSARAAQAPHWKS
        ************************************************************

HDAC9v1 LQQQDVTAVPMSPSSHSPEGRPPPLLPGGPVCKAAASAPSSLLDQPCLCPAPSVRTAVAL
HDAC9v2 LQQQDVTAVPMSPSSHSPEGRPPPLLPGGPVCKAAASAPSSLLDQPCLCPAPSVRTAVAL
HDAC9v3 LQQQDVTAVPMSPSSHSPEGRPPPLLPGGPVCKAAASAPSSLLDQPCLCPAPSVRTAVAL
        ************************************************************

HDAC9v1 TTPDITLVLPPDVIQQEA------------------------------------------
HDAC9v2 TTPDITLVLPPDVIQQEASALREETEAWARPHESLAREEALTALGKLLYLLDGMLDGQVN
HDAC9v3 TTPDITLVLPPDVIQQEASALREETEAWARPHESLAREEALTALGKLLYLLDGMLDGQVN
        ******************

HDAC9v1 ------------------------------------------------------------
HDAC9v2 SGIAATPASAAAATLDVAVRRGLSHGAQRLLCVALGQLDRPPDLAHDGRSLWLNIRGKEA
HDAC9v3 SGIAATPASAAAATLDVAVRRGLSHGAQSWGVGEGLLEAMPGGSPAQRLSSHSTPAHGPV

HDAC9v1 ---------------------CILGLVLPLAYGFQPDLVLVALGPGHGLQGPHAALLAAM
HDAC9v2 AALSMFHVSTPLPVMTGGFLSCILGLVLPLAYGFQPDLVLVALGPGHGLQGPHAALLAAM
HDAC9v3 NALPPLPLRFGLRRMTGGFLSCILGLVLPLAYGFQPDLVLVALGPGHGCRAPTLHSWLQC
                             *************************** :.*

HDAC9v1 LRGLAGGRVLALLEENSTPQLAGILARVLNGEAPPSLGLSSVASPEDVQALMYLRGQLEP
HDAC9v2 LRGLAGGRVLALLEEVSWAGWR--CCGVGRGKGP--VTASVFAPGPELHTPASRDPGPGA
HDAC9v3 FGGWQG AESWPSWR RGRPGP YVPERAAGASVEDVAVPSS PGGLKSA
:***..*:*.::..

HDAC9v1 QWKMLQCHPHLVA
HDAC9v2 EWRGTS
HDAC9v3 K
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Fig. 14. 

SEQ ID NO: 7 

>HDAC9v2 DNA sequence 

1 ATGGGGACCGCGCTTGTGTAC 
61 CCCGAGTGCGAGATCX^GCGTCCTGAGCGCCTGAC 

121 CGCGGCCTGGAACAGAGGTGTCTGCGGTTGTCAGCCCGCGAGGCCTCGGAAGAGGAGCTG

181 GGCCTGGTGCACAGCCCAGAGTATGTATCCCTGGTCAGGGAGACCCAGGTCCTAGGCAAG

241 GAGGAGCTGCAGGCGCTGTCCGGACAGTTC 

301 CACTGCX3CGCGGCTGGCCGCAGGGGCTGGACT 

361 GCTGTGCAAAATGGGCTTGCCCTGGTGAGGC^ 

421 GCCAACGGGTTCTGTGTGTTGAAG 

481 CACGK3GCTACACAGGATCCTCGTC 

541 TATOTCTTTGAGGATGACCCCAGCGTCCTT^ 

601 CGCTTCTGGCCTTTCCTGC^ 

661 GGCTTCACTCTCAACCTGCCCrGGAACCAG^ 

721 GCCTTCCTGCACCTGCTGCTCCCACT^ 

781 TCGGCAGGATTTGACTC^GCGATCGGGGACCCTGAGG^ 

841 TGCTTCGCCC^CCTCACAC^ 

901 CTGGAGGGCGGCTACCACCTGGAGTCACTGGC 

961 CTGCTGGGTGACCCGGCCCCACCCCTC^ 
1021 GGGAGTGCCCTAGAGTCCATCCAGAGTGCCCGTGCTGCCCAGGCCCCGCACT 
1081 CTCCAGCAGCAAGATGTGACCX3CTGTGCCGATGAGCCCCAG(^ 
1141 AGGCCTCCACCTCTGCTGCCTGGGGGTC 
1201 TCCCTCCTGGACCAGCOSTGCCTC^ 

1261 ACAACGCCGGATATCACATTGGTTCTGCCCCCTGACGTCATCCA^ 

1321 CTGAGGGAGGAGACAGAAGCCTGGK^ 

1381 CTCACTGCACTTGGGAAGCTCCTG 

1441 AGTGGTATAGCAGCCACTCCAGCCTCTGCTC 

1501 AGAGGCCTGTCCCACGGAGCCCAGAGGCTGCTGTGCGTGGCCCTGGGACAGCTGGACCG 

1561 CCTCCAGACCTCGCCCATGACGGGAGGAGTCTGTGGC^ 

1621 GCTGCCCTATCCATGTTCC^TGTCTCCACGCCAOTGCCAGTGATGACCGG 

1681 AGCTGCATCTTGGGCTTGGTGCTGCCCCTGGCCTATGGCTTCCA 

1741 GTGGCGCTGGGGCCTGGCCATGGCCTGCAGGGCCCCCACGCT^ 

1801 CTTCGGGGGCTGGCAGGGGGCCGAGTCCTGGCCOT 

1861 TGGAGGTGCTGCGGGGTGGGACGAGGGGAAGGACCAGTGACTGCTO 

1921 GGTCCAGAACTCCACACCCCAGCTAGC^ 

1981 ACCTCCTAGCCTAGGCCTTTCCTCTGTGGCCTC 

2041 CCTGAGAGGGCAGCTGGAGCCTCAGTGGAAGATGTTGCAGTG

2101 TTGA 

SEQ ID NO: 5 

>HDAC9v2 peptide sequence 

1 MGTALVYHEDMTATRLLWDDPECEIERPERLTAA 
61 GLVHSPEYVSLVRETQVLGKEELQALSGQFDAIYFH^
121 AVQNGLALVRPPGHHGQRAAANGFCVFNNVAIAAAHAKQKHGLHRI I Q

181 YLFEDDPSVLYFSWHRYEHGRFWPFLRESDA
241 AFLHLLLPLAFEFDPELVLVSAGFDSAIGDPEGQMQAT^
301 LEGGYHLESLAESVCMTVQTLLGDPAPPLSGP^

361 LQQQDVTAVPMSPSSHSPEGRPPPLLPGGPVCKAAASAPSSLLDQPCLCPAPSVRTAVAL

421 TTPDITLVLPPDVIQQEASALREETEAWARPHESLAREEALTALGKLL

481 SGIAATPASAAAATLDVAVRRGLSHGAQRLLCVALGQLDRPPDLAHDGRSLWLNIRGKEA
541 AALSMFHVSTPLPVMTGGFLSCILGLV^

601 LRGLAGGRVLALLEEVSWAGWRCCGVGRGKGPVT^ 

661 TS 

SEQ ID NO: 8

>HDAC9v3 DNA sequence 

1 ATGGGGACCGCXSCTTGTGTACCAT^ 
61 CCCGAGTGCGAGATO^GCGTCCTGAGCGCCTO 

121 CGCGGCCTGGAACAGAGGTGTCTGCGGTTGTCAGCCCGCGAGGCCTCGGAAGAGGAGCTG

181 GGCCTGGTGCACAGCCCAGAGTATGTATCCCTGGTCAGGGAGACCCAGGTCCTAGGCAAG

241 GAGGAGCIN3CAGGCX3CTGTCra 

301 CACTGCGCGCGGCTGGCCGCAGGGGCTGGACTC 

361 GCTGTGCAAAATGGGCTTGCCCTGGTGAGGCCT 

421 GCCAACGGGTTCTGCGTGTTCAACAACGTGGCC^ 

481 CACGGGCTACACAGGATCCTCX5TCX3TGGACT 

541 TATCTCTTTGAGGATGACCCCAGCGTCCTTTACTTCT 

601 CGCTTCTGGCCTTTTCCTGCGAGAGTCAGATGC^ 

661 GGCTTCACTGTCAACCTGCCCTGGAAC 

721 GGATTTGACTCAGCCATCGGGGACCCTGAGGGGCAAATC 

781 GCCCACCTCACACAGCTGCTGCAGGTGCTGGCCGGCGGCCGGGTCTGTO 

841 GGCGGCTACCACCTGGAGTCACTGGCGGAGTCAGTGTGCATGACAGTAC^ 

901 GGTGACCCXSGCCCCACCCCTGTCAGGGCCAAT^ 

961 GCCCTAGAGTCC3VTCCAGAGTGCCCGTGCTC 
1021 C^GCAAGATGTGACCGCTGTGCCGATGAGCC^ 
1081 CCACCTCTGCTGCCTGGGGGTCCAGTGTGTAA 

1141 CTGGACCAGCCGTGCCTCTGCCCCGCACCCTCTGTCCGCACCGCTGTTGCCCTGACAACG 

1201 CCGGATATCACATTGGTTCTGCCCCCTGACGTCAT 

1261 GAGGAGACAGAAGCCTGGGCCAGGCCACACGAGTCCCT 

1321 GCACTTGGGAAGCTCCTGTACCTCTTAGATGGGATGCTGGATGGGC^ 

1381 ATAGGAGCCACTCCAGCCTCTGCTGCAGCAGCCACCCTGGATGTGGCT 

1441 CTGTCCCACGGAGCCCAG^GCTGGGGTGTGGGAGAAGGGCTGCTGGAGGCAA 

1501 GGGTCTCCAGCACAGAGGCTCAGCAGTCACAGCACCCCTGCCG^ CGTGAATGCT 

1561 CTTCCACCTCTGCCTCTGCGGTTTGGGCT 

1621 ATCTTGGGCTTGGTGCTGCCCCTGGCCT^ 

1681 CTGGGGCCTGGCCATGGCTGCAGGGCCC 

1741 GGCTGGCAGGGGGCCGAGTCCTGGCCCTCCTGGAGGAGAGGACGTCCAGGCCCTTATGTA 
1801 CCTGAGAGGGCAGCTGGAGCCTCAGTGGAAGATGTTGCAGTGCCATCCTCACCTGGTGGC
1861 TTGAAATCGGCCAAG 

SEQ ID NO: 6

>HDAC9v3 peptide sequence 
1 MGTALVYHEDMTATRI^ 
61 GLVHSPEYVSLVRETQVLGKEELQALSGQFDAIYFHPSTra
121 AVQNGLALVRPPGHHGQRAAANGFCVFNNVAIAAAHAKQra
181 YLFEDDPSVLYFSWHRYEHGRFWPFLRESD
241 GFDSAIGDPEGQMQATPECFAHLTQLLQVLAGGRVCAVLE^^

301 GDPAPPLSGPMAPCQRCEGSALESIQSARAAQAPHWKSLQQQDVTAVPMSPSSHSPEGRP
361 PPLLPGGPVCKAAASAPSSLLDQPCLCPAPSVRTAVALTTPDITLVLPPDVIQQEASALR
421 EETEAWARPHESLAREEALTALGKLLYLLDGML^

481 LSHGAQSWGVGEGLLEAMPGGSPAQRLSSHSTPAHGPVNALPPLPLRFGLRRMTGGFL
541 ILGLVLPLAYGFQPDLVLVALGPGHGCRAPTLHSWLQCFGGWQGAESWPSWRRGRPGPYV
601 PERAAGASVEDVAVPSSPGGLKSAK
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